Open-source C++ projects categorized as GPU

Top 23 C++ GPU Projects

  • taichi

    Productive & portable high-performance programming in Python.

    Project mention: Taichi v1.5.0 Released! See what's new👇 | reddit.com/r/taichi_lang | 2023-04-17

    Check out the release notes (https://github.com/taichi-dev/taichi/releases) for more improvements.

  • Open3D

    Open3D: A Modern Library for 3D Data Processing

    Project mention: Import many photogrammetry software's scenes into Blender | reddit.com/r/photogrammetry | 2023-03-26

    Open3D (JSON, LOG, PLY) 1

  • cudf

    cuDF - GPU DataFrame Library

    Project mention: A Polars exploration into Kedro | dev.to | 2023-05-17

    The interesting thing about Polars is that it does not try to be a drop-in replacement to pandas, like Dask, cuDF, or Modin, and instead has its own expressive API. Despite being a young project, it quickly got popular thanks to its easy installation process and its “lightning fast” performance.

  • Halide

    a language for fast, portable data-parallel computation

    Project mention: Two-tier programming language | reddit.com/r/ProgrammingLanguages | 2023-04-19

    Halide https://halide-lang.org/

  • Thrust

    The C++ parallel algorithms library.

    Project mention: Parallel Computations in C++: Where Do I Begin? | reddit.com/r/learnprogramming | 2022-09-23

    For a higher-level GPU interface, Thrust provides "standard library"-like functions that run in parallel on the GPU (Nvidia only).
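    To make the "standard library"-like style concrete, here is a minimal sketch using plain std:: algorithms, which run without a GPU; Thrust mirrors this API, so the comments note the thrust:: counterparts you would swap in (thrust::device_vector, thrust::transform, thrust::reduce):

    ```cpp
    // Sketch of the STL-like style that Thrust mirrors.
    // With Thrust: std::vector -> thrust::device_vector,
    // std::transform -> thrust::transform, std::reduce -> thrust::reduce,
    // and the work runs on the GPU instead of the CPU.
    #include <algorithm>
    #include <iostream>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<int> x(1 << 10);
        std::iota(x.begin(), x.end(), 0);  // 0, 1, 2, ..., 1023

        // Elementwise transform (thrust::transform on a device_vector).
        std::transform(x.begin(), x.end(), x.begin(),
                       [](int v) { return 2 * v; });

        // Reduction (thrust::reduce).
        int sum = std::reduce(x.begin(), x.end(), 0);

        std::cout << sum << "\n";  // 2 * (0 + 1 + ... + 1023) = 1047552
        return 0;
    }
    ```

    The appeal is exactly this familiarity: code written against containers and algorithms ports to the GPU by changing types, not structure.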

  • MegEngine

    MegEngine is a fast, scalable, easy-to-use deep learning framework with automatic differentiation.

  • DALI

    A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

    Project mention: DirectStorage - Loading data to GPU *directly* from the SSD drive, almost without using CPU | reddit.com/r/deeplearning | 2023-05-07

    Check out https://github.com/nvidia/DALI



  • meshoptimizer

    Mesh optimization library that makes meshes smaller and faster to render

    Project mention: Nanite-like LODs experiments | reddit.com/r/GraphicsProgramming | 2023-04-12

    1) I used meshoptimizer to simplify meshes and building meshlets. But to combine these meshlets (for more efficient simplifying) I built a graph with weights and partition it. In this weight some metrics are included: the shared border length of a pair of meshlets, their facing direction and other similar metrics. The next step is to simplify these groups of meshlets (fixate borders for each of them and simplify meshlets within the group). So, you can make dependencies of newly generated meshlets on the initial ones.
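    One step of the pipeline described above can be sketched in isolation: weighting pairs of meshlets by shared-border length before handing the graph to a partitioner. The sketch below is hypothetical (the names `Meshlet`, `shared_border`, and `build_graph` are illustrative, not meshoptimizer API), representing a meshlet as its set of boundary edges and a real metric would also fold in facing direction:

    ```cpp
    // Hypothetical sketch: build the weighted meshlet graph that a
    // partitioner (e.g. METIS) would consume, as described above.
    #include <algorithm>
    #include <cassert>
    #include <cstdint>
    #include <map>
    #include <set>
    #include <utility>
    #include <vector>

    using Edge = std::pair<uint32_t, uint32_t>;  // normalized: first <= second

    Edge make_edge(uint32_t a, uint32_t b) {
        return {std::min(a, b), std::max(a, b)};
    }

    // A meshlet here is just the set of edges on its boundary.
    using Meshlet = std::set<Edge>;

    // Weight = number of boundary edges two meshlets share.
    int shared_border(const Meshlet& a, const Meshlet& b) {
        int n = 0;
        for (const Edge& e : a) n += static_cast<int>(b.count(e));
        return n;
    }

    // Adjacency with edge weights: only connected pairs get an entry.
    std::map<std::pair<size_t, size_t>, int>
    build_graph(const std::vector<Meshlet>& meshlets) {
        std::map<std::pair<size_t, size_t>, int> w;
        for (size_t i = 0; i < meshlets.size(); ++i)
            for (size_t j = i + 1; j < meshlets.size(); ++j)
                if (int s = shared_border(meshlets[i], meshlets[j]))
                    w[{i, j}] = s;
        return w;
    }

    int main() {
        Meshlet m0{make_edge(0, 1), make_edge(1, 2)};
        Meshlet m1{make_edge(1, 2), make_edge(2, 3)};  // shares edge (1,2) with m0
        Meshlet m2{make_edge(7, 8)};                   // isolated
        auto g = build_graph({m0, m1, m2});
        assert(g.size() == 1);
        assert((g[{0, 1}] == 1));
        return 0;
    }
    ```

    Partitioning this graph groups meshlets with long shared borders, which is what makes fixing the group border and simplifying the interior effective.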

  • ArrayFire

    ArrayFire: a general purpose GPU library.

    Project mention: Learn WebGPU | news.ycombinator.com | 2023-04-27

    Loads of people have stated why easy GPU interfaces are difficult to create, but we solve many difficult things all the time.

    Ultimately I think CPUs are just satisfactory for the vast vast majority of workloads. Servers rarely come with any GPUs to speak of. The ecosystem around GPUs is unattractive. CPUs have SIMD instructions that can help. There are so many reasons not to use GPUs. By the time anyone seriously considers using GPUs they're, in my imagination, typically seriously starved for performance, and looking to control as much of the execution details as possible. GPU programmers don't want an automagic solution.

    So I think the demand for easy GPU interfaces is just very weak, and therefore no effort has taken off. The amount of work needed to make it as easy to use as CPUs is massive, and the only reason anyone would even attempt to take this on is to lock you in to expensive hardware (see CUDA).

    For a practical suggestion, have you taken a look at https://arrayfire.com/ ? It can run on both CUDA and OpenCL, and it has C++, Rust and Python bindings.

  • cuml

    cuML - RAPIDS Machine Learning Library

    Project mention: Is it possible to run Sklearn models on a GPU? | reddit.com/r/datascience | 2023-03-05

    sklearn can't, but take a look at cuML (https://github.com/rapidsai/cuml). It uses the same API as sklearn but executes on the GPU.

  • cutlass

    CUDA Templates for Linear Algebra Subroutines

    Project mention: Want to understand INT8 better | reddit.com/r/CUDA | 2023-05-03

    The latter (and I guess you were asking about this one) is designed to accelerate NN inference in reduced precision. It is possible to use Tensor Cores for your own purposes, mainly through CUTLASS. But because Tensor Cores are designed to execute matrix multiplications, it can be hard to adapt your problem to them. The performance with them is insane (IIRC 32x the performance of the INT32 pipeline), but only for matrix multiplication…
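    What that INT8 path computes can be written down as a plain-C++ reference, no GPU required: int8 inputs with 32-bit accumulation so products don't overflow. This is only a sketch of the math CUTLASS tiles onto Tensor Cores, not CUTLASS code:

    ```cpp
    // Reference for the INT8 GEMM that Tensor Cores accelerate:
    // int8 inputs, int32 accumulation (row-major layout).
    #include <cstdint>
    #include <vector>

    // C[m][n] = sum_k A[m][k] * B[k][n]
    std::vector<int32_t> gemm_s8s32(const std::vector<int8_t>& A,
                                    const std::vector<int8_t>& B,
                                    int M, int N, int K) {
        std::vector<int32_t> C(static_cast<size_t>(M) * N, 0);
        for (int m = 0; m < M; ++m)
            for (int k = 0; k < K; ++k) {
                int32_t a = A[m * K + k];
                for (int n = 0; n < N; ++n)
                    C[m * N + n] += a * static_cast<int32_t>(B[k * N + n]);
            }
        return C;
    }

    int main() {
        // 2x2 example: A = [[1,-2],[3,4]], B = [[5,6],[7,8]]
        std::vector<int8_t> A{1, -2, 3, 4}, B{5, 6, 7, 8};
        auto C = gemm_s8s32(A, B, 2, 2, 2);
        // Expected: [[-9, -10], [43, 50]]
        return (C[0] == -9 && C[1] == -10 && C[2] == 43 && C[3] == 50) ? 0 : 1;
    }
    ```

    "Adapting your problem" to Tensor Cores means expressing it in exactly this A·B shape; anything that can't be phrased as a (batched) matrix multiply doesn't benefit.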

  • heavydb

    HeavyDB (formerly OmniSciDB)

  • tiny-cuda-nn

    Lightning fast C++/CUDA neural network framework

    Project mention: [D] Have there been any attempts to create a programming language specifically for machine learning? | reddit.com/r/MachineLearning | 2023-02-11

    In the opposite direction from your question is a very interesting project, tiny-cuda-nn, implemented as close to the metal as possible and very fast: https://github.com/NVlabs/tiny-cuda-nn

  • deepdetect

    Deep Learning API and Server in C++14, with support for Caffe, PyTorch, TensorRT, Dlib, NCNN, Tensorflow, XGBoost and TSNE

    Project mention: [D] Deep Learning Framework for C++. | reddit.com/r/MachineLearning | 2022-06-12

    But you need to have good reasons to do it. Ours is that we have a multi-backend framework, and that we don't want any step in between dev & run. C++ allows for this since the same code can run on the training server and on the edge device as needed. It also allows for building full AI applications with great performance (e.g. real time). We develop & use https://github.com/jolibrain/deepdetect for these purposes and it serves us very well, but it's not for the faint of heart!

  • libcudacxx

    The C++ Standard Library for your entire system.


  • CV-CUDA

    CV-CUDA™ is an open-source, GPU-accelerated library for cloud-scale image processing and computer vision.

    Project mention: Microsoft, Tencent, Baidu Adopting Nvidia CV-CUDA for Computer Vision AI | news.ycombinator.com | 2023-03-21

    I'm not familiar with CV-CUDA but it looks interesting. The GitHub repo may be more useful than the press release: https://github.com/CVCUDA/CV-CUDA

  • GLSL-PathTracer

    A GLSL Path Tracer

  • Boost.Compute

    A C++ GPU Computing Library for OpenCL

  • rpi-vk-driver

    VK driver for the Raspberry Pi (Broadcom Videocore IV)

    Project mention: Failed to open Light Display Manager when booting | reddit.com/r/raspberry_pi | 2022-09-04

    I was trying to install Vulkan on my Pi3B+ following this link

  • marian

    Fast Neural Machine Translation in C++

    Project mention: [P] A CLI tool for easy transformer sequence classifier training and inference | reddit.com/r/MachineLearning | 2023-02-01

    As a reference, I forked https://github.com/marian-nmt/marian privately to support sequence tagging tasks. With a positional loss mask, it can also support sequence classification.

  • compute-runtime

    Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver

    Project mention: Vladmandic Stable Diffusion added Intel ARC GPU support on Linux | reddit.com/r/IntelArc | 2023-05-01

    Update: I was able to fix my issue. I'm using Ubuntu 22.04.2 LTS and have the newest available kernel, 6.3.1. Installing the drivers via apt does not work; instead I needed to use https://github.com/intel/compute-runtime/releases/

  • stdgpu

    stdgpu: Efficient STL-like Data Structures on the GPU

  • CLBlast

    Tuned OpenCL BLAS

    Project mention: OpenCL in Termux | reddit.com/r/termux | 2023-06-02

    Install CLBlast:

    cd
    git clone https://github.com/CNugteren/CLBlast.git
    cd CLBlast
    cmake -B build \
      -DBUILD_SHARED_LIBS=OFF \
      -DTUNERS=OFF \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_INSTALL_PREFIX=/data/data/com.termux/files/usr
    cd build
    make -j8
    make install

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-06-02.


What are some of the best open-source GPU projects in C++? This list will help you:

Project Stars
1 taichi 23,170
2 Open3D 8,951
3 cudf 5,540
4 Halide 5,420
5 Thrust 4,565
6 MegEngine 4,532
7 DALI 4,424
8 meshoptimizer 4,195
9 ArrayFire 4,145
10 cuml 3,350
11 cutlass 2,817
12 heavydb 2,801
13 tiny-cuda-nn 2,467
14 deepdetect 2,457
15 libcudacxx 2,228
16 CV-CUDA 1,661
17 GLSL-PathTracer 1,471
18 Boost.Compute 1,411
19 rpi-vk-driver 1,201
20 marian 1,036
21 compute-runtime 928
22 stdgpu 913
23 CLBlast 731