Open-source C++ projects categorized as GPU

Top 23 C++ GPU Projects

  • taichi

    Productive & portable high-performance programming in Python.

    Project mention: Taichi v1.5.0 Released! See what's new👇 | reddit.com/r/taichi_lang | 2023-04-17

    Check out the release notes (https://github.com/taichi-dev/taichi/releases) for more improvements.

  • Open3D

    Open3D: A Modern Library for 3D Data Processing

    Project mention: Import many photogrammetry software's scenes into Blender | reddit.com/r/photogrammetry | 2023-03-26

    Open3D (JSON, LOG, PLY) 1

  • cudf

    cuDF - GPU DataFrame Library

    Project mention: A Polars exploration into Kedro | dev.to | 2023-05-17

    The interesting thing about Polars is that it does not try to be a drop-in replacement to pandas, like Dask, cuDF, or Modin, and instead has its own expressive API. Despite being a young project, it quickly got popular thanks to its easy installation process and its “lightning fast” performance.

  • Halide

    a language for fast, portable data-parallel computation

    Project mention: Two-tier programming language | reddit.com/r/ProgrammingLanguages | 2023-04-19

    Halide https://halide-lang.org/

  • Thrust

    The C++ parallel algorithms library.

    Project mention: Parallel Computations in C++: Where Do I Begin? | reddit.com/r/learnprogramming | 2022-09-23

    For a higher-level GPU interface, Thrust provides "standard library"-like functions that run in parallel on the GPU (Nvidia only).
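    To make the "standard library"-like style concrete, here is a minimal sketch using plain std:: algorithms, which run without a GPU; Thrust mirrors this API, so the comments note the thrust:: counterparts you would swap in (thrust::device_vector, thrust::transform, thrust::reduce):

    ```cpp
    // Sketch of the STL-like style that Thrust mirrors.
    // With Thrust: std::vector -> thrust::device_vector,
    // std::transform -> thrust::transform, std::reduce -> thrust::reduce,
    // and the work runs on the GPU instead of the CPU.
    #include <algorithm>
    #include <iostream>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<int> x(1 << 10);
        std::iota(x.begin(), x.end(), 0);  // 0, 1, 2, ..., 1023

        // Elementwise transform (thrust::transform on a device_vector).
        std::transform(x.begin(), x.end(), x.begin(),
                       [](int v) { return 2 * v; });

        // Reduction (thrust::reduce).
        int sum = std::reduce(x.begin(), x.end(), 0);

        std::cout << sum << "\n";  // 2 * (0 + 1 + ... + 1023) = 1047552
        return 0;
    }
    ```

    The appeal is exactly this familiarity: code written against containers and algorithms ports to the GPU by changing types, not structure.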

  • MegEngine

    MegEngine is a fast, scalable, easy-to-use deep learning framework with automatic differentiation.

  • DALI

    A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

    Project mention: DirectStorage - Loading data to GPU *directly* from the SSD drive, almost without using CPU | reddit.com/r/deeplearning | 2023-05-07

    Check out https://github.com/nvidia/DALI



  • meshoptimizer

    Mesh optimization library that makes meshes smaller and faster to render

    Project mention: Nanite-like LODs experiments | reddit.com/r/GraphicsProgramming | 2023-04-12

    1) I used meshoptimizer to simplify meshes and building meshlets. But to combine these meshlets (for more efficient simplifying) I built a graph with weights and partition it. In this weight some metrics are included: the shared border length of a pair of meshlets, their facing direction and other similar metrics. The next step is to simplify these groups of meshlets (fixate borders for each of them and simplify meshlets within the group). So, you can make dependencies of newly generated meshlets on the initial ones.
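    One step of the pipeline described above can be sketched in isolation: weighting pairs of meshlets by shared-border length before handing the graph to a partitioner. The sketch below is hypothetical (the names `Meshlet`, `shared_border`, and `build_graph` are illustrative, not meshoptimizer API), representing a meshlet as its set of boundary edges and a real metric would also fold in facing direction:

    ```cpp
    // Hypothetical sketch: build the weighted meshlet graph that a
    // partitioner (e.g. METIS) would consume, as described above.
    #include <algorithm>
    #include <cassert>
    #include <cstdint>
    #include <map>
    #include <set>
    #include <utility>
    #include <vector>

    using Edge = std::pair<uint32_t, uint32_t>;  // normalized: first <= second

    Edge make_edge(uint32_t a, uint32_t b) {
        return {std::min(a, b), std::max(a, b)};
    }

    // A meshlet here is just the set of edges on its boundary.
    using Meshlet = std::set<Edge>;

    // Weight = number of boundary edges two meshlets share.
    int shared_border(const Meshlet& a, const Meshlet& b) {
        int n = 0;
        for (const Edge& e : a) n += static_cast<int>(b.count(e));
        return n;
    }

    // Adjacency with edge weights: only connected pairs get an entry.
    std::map<std::pair<size_t, size_t>, int>
    build_graph(const std::vector<Meshlet>& meshlets) {
        std::map<std::pair<size_t, size_t>, int> w;
        for (size_t i = 0; i < meshlets.size(); ++i)
            for (size_t j = i + 1; j < meshlets.size(); ++j)
                if (int s = shared_border(meshlets[i], meshlets[j]))
                    w[{i, j}] = s;
        return w;
    }

    int main() {
        Meshlet m0{make_edge(0, 1), make_edge(1, 2)};
        Meshlet m1{make_edge(1, 2), make_edge(2, 3)};  // shares edge (1,2) with m0
        Meshlet m2{make_edge(7, 8)};                   // isolated
        auto g = build_graph({m0, m1, m2});
        assert(g.size() == 1);
        assert((g[{0, 1}] == 1));
        return 0;
    }
    ```

    Partitioning this graph groups meshlets with long shared borders, which is what makes fixing the group border and simplifying the interior effective.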

  • ArrayFire

    ArrayFire: a general purpose GPU library.

    Project mention: Learn WebGPU | news.ycombinator.com | 2023-04-27

    Loads of people have stated why easy GPU interfaces are difficult to create, but we solve many difficult things all the time.

    Ultimately I think CPUs are just satisfactory for the vast vast majority of workloads. Servers rarely come with any GPUs to speak of. The ecosystem around GPUs is unattractive. CPUs have SIMD instructions that can help. There are so many reasons not to use GPUs. By the time anyone seriously considers using GPUs they're, in my imagination, typically seriously starved for performance, and looking to control as much of the execution details as possible. GPU programmers don't want an automagic solution.

    So I think the demand for easy GPU interfaces is just very weak, and therefore no effort has taken off. The amount of work needed to make it as easy to use as CPUs is massive, and the only reason anyone would even attempt to take this on is to lock you in to expensive hardware (see CUDA).

    For a practical suggestion, have you taken a look at https://arrayfire.com/ ? It can run on both CUDA and OpenCL, and it has C++, Rust and Python bindings.

  • cuml

    cuML - RAPIDS Machine Learning Library

    Project mention: Is it possible to run Sklearn models on a GPU? | reddit.com/r/datascience | 2023-03-05

    sklearn can't, but take a look at cuML (https://github.com/rapidsai/cuml). It uses the same API as sklearn but executes on the GPU.

  • cutlass

    CUDA Templates for Linear Algebra Subroutines

    Project mention: Want to understand INT8 better | reddit.com/r/CUDA | 2023-05-03

    The latter (and I guess you were asking about this one) is designed to accelerate NN inference in reduced precision. It is possible to use Tensor Cores for your own purposes, mainly through CUTLASS. But because Tensor Cores are designed to execute matrix multiplications, it can be hard to adapt your problem to them. The performance with them is insane (IIRC 32x the performance of the INT32 pipeline), but only for matrix multiplication…
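    What that INT8 path computes can be written down as a plain-C++ reference, no GPU required: int8 inputs with 32-bit accumulation so products don't overflow. This is only a sketch of the math CUTLASS tiles onto Tensor Cores, not CUTLASS code:

    ```cpp
    // Reference for the INT8 GEMM that Tensor Cores accelerate:
    // int8 inputs, int32 accumulation (row-major layout).
    #include <cstdint>
    #include <vector>

    // C[m][n] = sum_k A[m][k] * B[k][n]
    std::vector<int32_t> gemm_s8s32(const std::vector<int8_t>& A,
                                    const std::vector<int8_t>& B,
                                    int M, int N, int K) {
        std::vector<int32_t> C(static_cast<size_t>(M) * N, 0);
        for (int m = 0; m < M; ++m)
            for (int k = 0; k < K; ++k) {
                int32_t a = A[m * K + k];
                for (int n = 0; n < N; ++n)
                    C[m * N + n] += a * static_cast<int32_t>(B[k * N + n]);
            }
        return C;
    }

    int main() {
        // 2x2 example: A = [[1,-2],[3,4]], B = [[5,6],[7,8]]
        std::vector<int8_t> A{1, -2, 3, 4}, B{5, 6, 7, 8};
        auto C = gemm_s8s32(A, B, 2, 2, 2);
        // Expected: [[-9, -10], [43, 50]]
        return (C[0] == -9 && C[1] == -10 && C[2] == 43 && C[3] == 50) ? 0 : 1;
    }
    ```

    "Adapting your problem" to Tensor Cores means expressing it in exactly this A·B shape; anything that can't be phrased as a (batched) matrix multiply doesn't benefit.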

  • heavydb

    HeavyDB (formerly OmniSciDB)

  • tiny-cuda-nn

    Lightning fast C++/CUDA neural network framework

    Project mention: [D] Have there been any attempts to create a programming language specifically for machine learning? | reddit.com/r/MachineLearning | 2023-02-11

    In the opposite direction from your question is a very interesting project, tiny-cuda-nn, implemented as close to the metal as possible and very fast: https://github.com/NVlabs/tiny-cuda-nn

  • deepdetect

    Deep Learning API and Server in C++14, with support for Caffe, PyTorch, TensorRT, Dlib, NCNN, Tensorflow, XGBoost and TSNE

    Project mention: [D] Deep Learning Framework for C++. | reddit.com/r/MachineLearning | 2022-06-12

    But you need to have good reasons to do it. Ours is that we have a multi-backend framework, and that we don't want any step in between dev & run. C++ allows for this since the same code can run on the training server and on the edge device as needed. It also allows for building full AI applications with great performance (e.g. real time). We develop & use https://github.com/jolibrain/deepdetect for these purposes and it serves us very well, but it's not for the faint of heart!

  • libcudacxx

    The C++ Standard Library for your entire system.


  • CV-CUDA

    CV-CUDA™ is an open-source, GPU-accelerated library for cloud-scale image processing and computer vision.

    Project mention: Microsoft, Tencent, Baidu Adopting Nvidia CV-CUDA for Computer Vision AI | news.ycombinator.com | 2023-03-21

    I'm not familiar with CV-CUDA but it looks interesting. The GitHub repo may be more useful than the press release: https://github.com/CVCUDA/CV-CUDA

  • GLSL-PathTracer

    A GLSL Path Tracer

  • Boost.Compute

    A C++ GPU Computing Library for OpenCL

  • rpi-vk-driver

    VK driver for the Raspberry Pi (Broadcom Videocore IV)

    Project mention: Failed to open Light Display Manager when booting | reddit.com/r/raspberry_pi | 2022-09-04

    I was trying to install Vulkan on my Pi3B+ following this link

  • marian

    Fast Neural Machine Translation in C++

    Project mention: [P] A CLI tool for easy transformer sequence classifier training and inference | reddit.com/r/MachineLearning | 2023-02-01

    As a reference, I forked https://github.com/marian-nmt/marian privately to support sequence tagging tasks. With a positional loss mask, it can also support sequence classification.

  • compute-runtime

    Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver

    Project mention: Vladmandic Stable Diffusion added Intel ARC GPU support on Linux | reddit.com/r/IntelArc | 2023-05-01

    Update: I was able to fix my issue. I'm using Ubuntu 22.04.2 LTS and have the newest available kernel, 6.3.1. Installing the drivers via apt does not work; instead I needed to use https://github.com/intel/compute-runtime/releases/

  • stdgpu

    stdgpu: Efficient STL-like Data Structures on the GPU

  • CLBlast

    Tuned OpenCL BLAS

    Project mention: OpenCL in Termux | reddit.com/r/termux | 2023-06-02

    Install CLBlast:

    cd
    git clone https://github.com/CNugteren/CLBlast.git
    cd CLBlast
    cmake -B build \
      -DBUILD_SHARED_LIBS=OFF \
      -DTUNERS=OFF \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_INSTALL_PREFIX=/data/data/com.termux/files/usr
    cd build
    make -j8
    make install

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-06-02.


What are some of the best open-source GPU projects in C++? This list will help you:

Project Stars
1 taichi 23,170
2 Open3D 8,951
3 cudf 5,540
4 Halide 5,420
5 Thrust 4,565
6 MegEngine 4,532
7 DALI 4,424
8 meshoptimizer 4,195
9 ArrayFire 4,145
10 cuml 3,350
11 cutlass 2,817
12 heavydb 2,801
13 tiny-cuda-nn 2,467
14 deepdetect 2,457
15 libcudacxx 2,228
16 CV-CUDA 1,661
17 GLSL-PathTracer 1,471
18 Boost.Compute 1,411
19 rpi-vk-driver 1,201
20 marian 1,036
21 compute-runtime 928
22 stdgpu 913
23 CLBlast 731