Top 23 C++ Cuda Projects
-
Project mention: Does anyone else agree that the links to the latest development version of Open3D don't work? | /r/cscareerquestions | 2023-07-10
I was going to file a bug about another issue, but I have to download the development version. This is why I want this solved quickly. None of the links seem to work: https://github.com/isl-org/Open3D/issues/6259
-
The interesting thing about Polars is that it does not try to be a drop-in replacement for pandas, as Dask, cuDF, and Modin do, and instead has its own expressive API. Despite being a young project, it quickly became popular thanks to its easy installation process and its “lightning fast” performance.
-
Project mention: Optimization Techniques for GPU Programming [pdf] | news.ycombinator.com | 2023-08-09
I would recommend the course from Oxford (https://people.maths.ox.ac.uk/gilesm/cuda/). Also explore the tutorial section of CUTLASS (https://github.com/NVIDIA/cutlass/blob/main/media/docs/cute/...) if you want to learn more about high-performance GEMM.
-
Loads of people have stated why easy GPU interfaces are difficult to create, but we solve many difficult things all the time.
Ultimately I think CPUs are just satisfactory for the vast vast majority of workloads. Servers rarely come with any GPUs to speak of. The ecosystem around GPUs is unattractive. CPUs have SIMD instructions that can help. There are so many reasons not to use GPUs. By the time anyone seriously considers using GPUs they're, in my imagination, typically seriously starved for performance, and looking to control as much of the execution details as possible. GPU programmers don't want an automagic solution.
So I think the demand for easy GPU interfaces is just very weak, and therefore no effort has taken off. The amount of work needed to make it as easy to use as CPUs is massive, and the only reason anyone would even attempt to take this on is to lock you in to expensive hardware (see CUDA).
For a practical suggestion, have you taken a look at https://arrayfire.com/ ? It can run on both CUDA and OpenCL, and it has C++, Rust and Python bindings.
-
Project mention: Hip: Runtime API and Kernel Language for Portable Apps for AMD and Nvidia GPUs | news.ycombinator.com | 2024-03-10
-
Project mention: Show HN: Demo of Agent Based Model on GPU with CUDA and OpenGL (Windows/Linux) | news.ycombinator.com | 2023-12-04
-
Project mention: Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller | news.ycombinator.com | 2023-10-31
Just a point of clarification: faster-whisper references it, but CTranslate2[0] is what's really doing the magic here.
CTranslate2 is a sleeper powerhouse project that enables a lot. They should be front and center and get the credit they deserve.
-
Project mention: Calyx, a Compiler Infrastructure for Accelerator Generators | news.ycombinator.com | 2024-03-04
How is this different from the MLIR infrastructure of LLVM and XLA implemented in https://iree.dev/?
-
CV-CUDA
CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.
-
Project mention: [P] - VkFFT now supports quad precision (double-double) FFT computation on GPU | /r/MachineLearning | 2023-09-27
Hello, I am the creator of VkFFT, a GPU Fast Fourier Transform library for Vulkan/CUDA/HIP/OpenCL/Level Zero and Metal. In the latest update, I have added support for quad-precision FFT calculation via double-double emulation on most modern GPUs. I understand that modern ML is going in the opposite, low-precision direction, but I still think this functionality may be useful, at least for prototyping and developing concepts.
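For readers unfamiliar with the term, double-double emulation represents one high-precision value as the unevaluated sum of two ordinary doubles. The following is a minimal, generic sketch of double-double addition built on Knuth's error-free TwoSum. It is not taken from VkFFT: the names `DoubleDouble`, `two_sum`, and `dd_add` are hypothetical, and the sketch assumes strict IEEE 754 double arithmetic (no `-ffast-math`, no x87 extended precision).

```cpp
// Hypothetical type for illustration: the represented value is hi + lo,
// where lo holds the part of the value that does not fit in hi.
struct DoubleDouble {
    double hi;
    double lo;
};

// Knuth's TwoSum: s is the rounded sum of a and b, e is the rounding error,
// so that a + b == s + e holds exactly (barring overflow).
inline DoubleDouble two_sum(double a, double b) {
    double s = a + b;
    double bb = s - a;                     // the part of b that made it into s
    double e = (a - (s - bb)) + (b - bb);  // what was lost to rounding
    return {s, e};
}

// Simple ("sloppy") double-double addition: add the high parts exactly,
// fold in the low parts, then renormalize with a second TwoSum.
inline DoubleDouble dd_add(DoubleDouble x, DoubleDouble y) {
    DoubleDouble s = two_sum(x.hi, y.hi);
    double lo = s.lo + x.lo + y.lo;
    return two_sum(s.hi, lo);
}
```

As a usage example, adding `{1.0, 0.0}` and `{1e-20, 0.0}` with `dd_add` and then adding `{-1.0, 0.0}` recovers `1e-20`, whereas the same computation in plain `double` yields zero.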
-
nitro
An inference server on top of llama.cpp. OpenAI-compatible API, queue, & scaling. Embed a prod-ready, local inference engine in your apps. Powers Jan (by janhq)
I'd like to see a comparison to nitro https://github.com/janhq/nitro which has been fantastic for running a local LLM.
-
Project mention: An efficient C++17 GPU numerical computing library with Python-like syntax | /r/programming | 2023-10-05
-
vuda
VUDA is a header-only library based on Vulkan that provides a CUDA Runtime API interface for writing GPU-accelerated applications.
-
Project mention: For those interested in learning how to build a Language Identification solution using PyTorch, check out my article. | /r/learnmachinelearning | 2023-04-28
Link to code sample: https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/End-to-end-Workloads/LanguageIdentification
-
C++ Cuda related posts
- Open-source project ZLUDA lets CUDA apps run on AMD GPUs
- Optimization Example: Mandelbrot Set (part 1)
- AMD Funded a Drop-In CUDA Implementation Built on ROCm: It's Open-Source
- Nvidia Is Now More Valuable Than Amazon and Google
- Debian on Apple hardware (M1 and later)
- Running pre-trained ML models in Godot
- AMD's CDNA 3 Compute Architecture
Index
What are some of the best open-source CUDA projects in C++? This list will help you:
# | Project | Stars |
---|---|---|
1 | Open3D | 10,337 |
2 | cudf | 7,163 |
3 | oneflow | 5,687 |
4 | cutlass | 4,401 |
5 | ArrayFire | 4,383 |
6 | cuml | 3,859 |
7 | HIP | 3,410 |
8 | tiny-cuda-nn | 3,335 |
9 | alien | 3,243 |
10 | lightseq | 3,061 |
11 | heavydb | 2,893 |
12 | CTranslate2 | 2,667 |
13 | iree | 2,337 |
14 | CV-CUDA | 2,109 |
15 | slang | 1,587 |
16 | VkFFT | 1,432 |
17 | nitro | 1,425 |
18 | marian | 1,144 |
19 | MatX | 1,104 |
20 | stdgpu | 1,066 |
21 | vuda | 826 |
22 | oneAPI-samples | 810 |
23 | GPU-Raytracer | 744 |