Top 23 C++ GPU Projects
-
Project mention: Taichi: Productive, portable, and performant GPU programming in Python | news.ycombinator.com | 2024-08-20
-
Project mention: Unleashing GPU Power: Supercharge Your Data Processing with cuDF | dev.to | 2024-06-21
cuDF Documentation
-
catboost
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Project mention: 🚀 Why Your ML Service Needs Rust + CatBoost: A Setup Guide That Actually Works | dev.to | 2025-01-19
[package]
name = "MLApp"
version = "0.1.0"
edition = "2021"

[dependencies]
catboost = { git = "https://github.com/catboost/catboost", rev = "0bfdc35"}
-
Years ago I started a collection of convolution optimization resources: https://github.com/mratsim/laser/wiki/Convolution-optimisati...
Also checked and apparently Nvidia Cutlass now supports generic convolutions: https://github.com/NVIDIA/cutlass
-
> Making a nanite mesh is complicated, with a lot of internal offsets for linking, and so far only Unreal Engine's editor does it.
meshoptimizer [1] is an OSS implementation of meshlet generation, which is what most people think of when they think of "Nanite's algorithm". Bevy, mentioned in a sibling reply, uses meshoptimizer as the generation tool.
(Strictly speaking, "Nanite" is a brand name that encompasses a large collection of techniques, including meshlets, software rasterization, streaming, etc. For clarity during technical discussions, I prefer to talk about individual techniques, since they're really separate, even though they complement one another. For example, software rasterization can be used without meshlets if your triangles are really small. Streaming can be useful even if you aren't using meshlets. And so on.)
[1]: https://github.com/zeux/meshoptimizer
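To make the term concrete: a meshlet is a small cluster of triangles that carries its own compact vertex list. The sketch below is a naive greedy partition in plain C++, offered only as an illustration. It is not meshoptimizer's algorithm or API, and the `Meshlet` struct and `build_meshlets_naive` function are hypothetical names invented for this example. It assumes a plain triangle-list index buffer and caps each meshlet at a given number of unique vertices and triangles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical meshlet layout for this sketch (not meshoptimizer's struct).
struct Meshlet {
    std::vector<std::uint32_t> vertices;  // unique global vertex indices
    std::vector<std::uint8_t> triangles;  // local indices, 3 per triangle
};

// Naive greedy partition: walk the triangles in order and start a new
// meshlet whenever the next triangle would exceed either limit.
std::vector<Meshlet> build_meshlets_naive(const std::vector<std::uint32_t>& indices,
                                          std::size_t max_vertices,
                                          std::size_t max_triangles) {
    assert(indices.size() % 3 == 0);
    assert(max_vertices >= 3 && max_vertices <= 256);  // local index fits in 8 bits
    assert(max_triangles >= 1);

    std::vector<Meshlet> meshlets;
    Meshlet current;
    std::unordered_map<std::uint32_t, std::uint8_t> local;  // global -> local index

    for (std::size_t i = 0; i < indices.size(); i += 3) {
        // Count the vertices this triangle would add to the current meshlet.
        std::size_t new_vertices = 0;
        for (std::size_t k = 0; k < 3; ++k) {
            const std::uint32_t v = indices[i + k];
            const bool repeated_in_triangle =
                (k > 0 && v == indices[i]) || (k > 1 && v == indices[i + 1]);
            if (!repeated_in_triangle && local.find(v) == local.end()) ++new_vertices;
        }

        const bool too_many_vertices = current.vertices.size() + new_vertices > max_vertices;
        const bool too_many_triangles = current.triangles.size() / 3 >= max_triangles;
        if (!current.triangles.empty() && (too_many_vertices || too_many_triangles)) {
            meshlets.push_back(std::move(current));
            current = Meshlet{};
            local.clear();
        }

        // Append the triangle, assigning local indices to unseen vertices.
        for (std::size_t k = 0; k < 3; ++k) {
            const std::uint32_t v = indices[i + k];
            auto it = local.find(v);
            if (it == local.end()) {
                const auto slot = static_cast<std::uint8_t>(current.vertices.size());
                it = local.emplace(v, slot).first;
                current.vertices.push_back(v);
            }
            current.triangles.push_back(it->second);
        }
    }

    if (!current.triangles.empty()) meshlets.push_back(std::move(current));
    return meshlets;
}
```

Because this version takes triangles in index-buffer order, cluster quality depends entirely on how the input is ordered. A production generator would also consider spatial locality and vertex reuse.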
-
> Hence it becomes a game of scheduling. You already know what you need to optimise but actually doing so gets really hard really fast.
This immediately makes me think of Halide, which was specifically invented to make this easier to do by decoupling the algorithm from the scheduler.
Kind of sad that it doesn't seem to have caught on much.
[0] https://halide-lang.org/
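The idea of separating the algorithm from the schedule can be sketched in plain C++ without using Halide itself. In the example below, `blur_pixel` states what one output pixel is, while `blur_row_major` and `blur_tiled` are two traversal orders over the same definition. All names here, including the `Image` type, are hypothetical and chosen for this illustration. Halide expresses the same split through its own API, which this sketch does not use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical image type for this sketch: row-major grayscale floats.
struct Image {
    int width;
    int height;
    std::vector<float> data;

    Image(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, 0.0f) {}

    float& at(int x, int y) { return data[static_cast<std::size_t>(y) * width + x]; }
    float at(int x, int y) const { return data[static_cast<std::size_t>(y) * width + x]; }
};

// The "algorithm": the value of one output pixel, with no loop order attached.
// Here it is a horizontal 3-tap box blur with clamped borders.
inline float blur_pixel(const Image& in, int x, int y) {
    const int xl = std::max(x - 1, 0);
    const int xr = std::min(x + 1, in.width - 1);
    return (in.at(xl, y) + in.at(x, y) + in.at(xr, y)) / 3.0f;
}

// Schedule A: plain row-major traversal.
Image blur_row_major(const Image& in) {
    Image out(in.width, in.height);
    for (int y = 0; y < in.height; ++y)
        for (int x = 0; x < in.width; ++x)
            out.at(x, y) = blur_pixel(in, x, y);
    return out;
}

// Schedule B: the same algorithm, visited in tile-by-tile blocks.
Image blur_tiled(const Image& in, int tile) {
    assert(tile > 0);
    Image out(in.width, in.height);
    for (int ty = 0; ty < in.height; ty += tile)
        for (int tx = 0; tx < in.width; tx += tile)
            for (int y = ty; y < std::min(ty + tile, in.height); ++y)
                for (int x = tx; x < std::min(tx + tile, in.width); ++x)
                    out.at(x, y) = blur_pixel(in, x, y);
    return out;
}
```

Both functions produce the same image, so the traversal can be changed for cache or parallelism reasons without touching the definition of the result.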
-
DALI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
How to Accomplish: Use a combination of geometric transformations (e.g., rotation, scaling, cropping, flipping), color space adjustments (e.g., brightness, contrast, saturation), and other techniques (e.g., noise injection, blurring, cutout). Libraries such as ImgAug, DeepMind Augmentation, Albumentations, and NVIDIA DALI offer a wide range of ready-to-use augmentation techniques that can introduce the necessary diversity into your dataset.
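Two of the augmentations named above, flipping and brightness adjustment, can be sketched in plain C++ as follows. This is a minimal CPU illustration on an 8-bit grayscale buffer, not the API of DALI or any other library mentioned. The `GrayImage` struct and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit grayscale image for this sketch, stored row-major.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Geometric augmentation: mirror every row left to right.
GrayImage flip_horizontal(const GrayImage& in) {
    assert(in.pixels.size() == static_cast<std::size_t>(in.width) * in.height);
    GrayImage out = in;
    for (int y = 0; y < in.height; ++y) {
        auto row = out.pixels.begin() + static_cast<std::ptrdiff_t>(y) * in.width;
        std::reverse(row, row + in.width);
    }
    return out;
}

// Color augmentation: add a signed offset and saturate to the range [0, 255].
GrayImage adjust_brightness(const GrayImage& in, int delta) {
    GrayImage out = in;
    for (std::uint8_t& p : out.pixels) {
        const int shifted = static_cast<int>(p) + delta;
        p = static_cast<std::uint8_t>(std::min(std::max(shifted, 0), 255));
    }
    return out;
}
```

In a training pipeline the flip and the value of `delta` would typically be drawn at random per sample. That sampling is left out here to keep the sketch deterministic.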
-
FluidX3D
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
-
Did not know executorch existed! That's so cool! I have it on my bucket list to tinker with running LLMs on wearables after I'm a little further along in learning, great to see official tooling for that!
https://github.com/pytorch/executorch
-
deepdetect
Deep Learning API and Server in C++14 with support for PyTorch, TensorRT, Dlib, NCNN, TensorFlow, XGBoost and t-SNE
-
CV-CUDA
CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.
C++ GPU discussion
C++ GPU related posts
-
Robotics and ROS2 Course by University of Eastern Finland
-
The Missing Nvidia GPU Glossary
-
Halide – a language for fast, portable computation on images and tensors
-
FluidX3D
-
The Slang Shading Language
-
The Success and Failure of Ninja (2020)
-
Halide: A language for fast, portable computation on images and tensors
-
A note from our sponsor - SaaSHub
www.saashub.com | 27 Mar 2025
Index
What are some of the best open-source GPU projects in C++? This list will help you:
# | Project | Stars |
---|---|---|
1 | taichi | 26,902 |
2 | Open3D | 12,105 |
3 | cudf | 8,809 |
4 | catboost | 8,300 |
5 | cutlass | 7,168 |
6 | meshoptimizer | 6,096 |
7 | Halide | 6,007 |
8 | DALI | 5,331 |
9 | MegEngine | 4,786 |
10 | ArrayFire | 4,654 |
11 | cuml | 4,547 |
12 | FluidX3D | 4,306 |
13 | tiny-cuda-nn | 3,933 |
14 | heavydb | 2,975 |
15 | executorch | 2,637 |
16 | deepdetect | 2,528 |
17 | CV-CUDA | 2,471 |
18 | GLSL-PathTracer | 1,904 |
19 | Boost.Compute | 1,591 |
20 | cccl | 1,553 |
21 | MatX | 1,298 |
22 | marian | 1,286 |
23 | rpi-vk-driver | 1,232 |