Top 23 C++ Machine Learning Projects

tensorflow

221 182,323 10.0 C++

An Open Source Machine Learning Framework for Everyone

Project mention: TensorFlow-metal on Apple Mac is junk for training | news.ycombinator.com | 2024-01-16

tesseract-ocr

120 57,866 8.9 C++

Tesseract Open Source OCR Engine (main repository)

Project mention: one of the Codia AI Design technologies: OCR Technology | dev.to | 2024-02-14

You will also need to install the Tesseract OCR engine, which can be downloaded and installed from the following link: https://github.com/tesseract-ocr/tesseract

Caffe

6 33,859 0.0 C++

Caffe: a fast open framework for deep learning.

Project mention: List of AI-Models | /r/GPT_do_dah | 2023-05-16

openpose

36 29,802 5.2 C++

OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

Project mention: AI "Artists" Are Lazy, and the Ultimate Goal of AI Image Generation (hint: its sloth) | /r/ArtistHate | 2023-11-25

Open Pose, a multi-person keypoint detection library for body, face, hands, and foot estimation [10], is used for posing generated characters.

C-Plus-Plus

3 29,048 7.1 C++

Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.

xgboost

10 25,548 9.6 C++

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

Project mention: XGBoost 2.0 | news.ycombinator.com | 2023-10-13

mediapipe

49 25,405 9.9 C++

Cross-platform, customizable ML solutions for live and streaming media.

Project mention: Mediapipe openpose Controlnet model for SD | /r/localdiffusion | 2023-11-15

mediapipe/docs/solutions/pose.md at master · google/mediapipe · GitHub

DeepSpeech

67 24,212 0.0 C++

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

Project mention: Common Voice | news.ycombinator.com | 2023-12-05

PaddlePaddle

6 21,584 10.0 C++

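PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)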

Project mention: List of AI-Models | /r/GPT_do_dah | 2023-05-16

CNTK

1 17,435 0.0 C++

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

LightGBM

11 16,043 9.2 C++

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Project mention: SIRUS.jl: Interpretable Machine Learning via Rule Extraction | /r/Julia | 2023-06-29

SIRUS.jl is a pure Julia implementation of the SIRUS algorithm by Bénard et al. (2021). The algorithm is a rule-based machine learning model, meaning that it is fully interpretable. It achieves this by first fitting a random forest and then converting the forest to rules. Furthermore, the algorithm is stable and achieves a predictive performance that is comparable to LightGBM, a state-of-the-art gradient boosting model created by Microsoft. Interpretability, stability, and predictive performance are described in more detail below.

Dlib

33 13,011 7.9 C++

A toolkit for making real world machine learning and data analysis applications in C++

Project mention: Modern Image Processing Algorithms Implementation in C | news.ycombinator.com | 2023-06-06

onnxruntime

54 12,656 10.0 C++

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Project mention: Machine Learning with PHP | dev.to | 2024-04-22

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Open3D

11 10,485 8.6 C++

Open3D: A Modern Library for 3D Data Processing

Project mention: Does anyone else agree that the links to the latest development version of Open3D don't work? | /r/cscareerquestions | 2023-07-10

I was going to file a bug about another issue, but I have to download the development version. This is why I want this solved quickly. None of the links seem to work: https://github.com/isl-org/Open3D/issues/6259

vowpal_wabbit

11 8,400 8.1 C++

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

MNN

3 8,293 8.1 C++

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

Project mention: [D][R] Deploying deep models on memory constrained devices | /r/MachineLearning | 2023-10-03

However, I am looking at this subject through the problem of training/fine-tuning deep models on edge devices, which is an increasingly available thing to do. I am looking at tflite, Alibaba's MNN, mit-han-lab's tinyengine, etc.

jetson-inference

11 7,323 8.5 C++

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

serving

12 6,070 9.8 C++

A flexible, high-performance serving system for machine learning models

Project mention: Llama.cpp: Full CUDA GPU Acceleration | news.ycombinator.com | 2023-06-12

Yet another TEDIOUS BATTLE: Python vs. C++/C stack.
This project gained popularity due to the HIGH DEMAND for running large models with 1B+ parameters, like `llama`. Python dominates the interface and training ecosystem, but prior to llama.cpp, non-ML professionals showed little interest in a fast C++ interface library. While existing solutions like tensorflow-serving [1] in C++ were sufficiently fast with GPU support, llama.cpp took the initiative to optimize for CPU and trim unnecessary code, essentially code-golfing and sacrificing some algorithm correctness for improved performance, which isn't favored by "ML research".
NOTE: In my opinion, a true pioneer was DarkNet, which implemented the YOLO model series and significantly outperformed others [2]. Same trick basically like llama.cpp
[1] https://github.com/tensorflow/serving

interpret

6 5,988 9.7 C++

Fit interpretable models. Explain blackbox machine learning.

tiny-cnn

1 5,763 0.0 C++

header only, dependency-free deep learning framework in C++14

oneflow

32 5,715 8.8 C++

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

pixie

19 5,273 9.4 C++

Instant Kubernetes-Native Application Observability

Project mention: Grafana Beyla: OSS eBPF auto-instrumentation for application observability | news.ycombinator.com | 2023-09-13

flashlight

16 5,145 7.7 C++

A C++ standalone library for machine learning (by flashlight)

Project mention: MatX: Efficient C++17 GPU numerical computing library with Python-like syntax | news.ycombinator.com | 2023-10-03

I think a comparison to PyTorch, TensorFlow and/or JAX is more relevant than a comparison to CuPy/NumPy.
And then maybe also a comparison to Flashlight (https://github.com/flashlight/flashlight) or other C/C++ based ML/computing libraries?
Also, there is no mention of it, so I suppose this does not support automatic differentiation?

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

C++ Machine Learning related posts

Show HN: I ported Suno AI's Bark model in C for fast realistic audio generation
1 project | news.ycombinator.com | 24 Apr 2024
Supabase Storage: now supports the S3 protocol
5 projects | news.ycombinator.com | 18 Apr 2024
Bark.cpp: Port of Suno AI's Bark in C/C++ for fast inference
1 project | news.ycombinator.com | 19 Apr 2024
Comparative Analysis of Memory Consumption: OpenMLDB vs Redis Test Report
1 project | dev.to | 3 Apr 2024
Ultra High-Performance Database OpenM(ysq)LDB: Seamless Compatibility with MySQL Protocol and Multi-Language MySQL Client
1 project | dev.to | 26 Mar 2024
Mastering Distributed Database Development in 10 Minutes with OpenMLDB Developer Docker Image
1 project | dev.to | 13 Mar 2024
Why do tree-based models still outperform deep learning on tabular data? (2022)
3 projects | news.ycombinator.com | 5 Mar 2024

Index

What are some of the best open-source Machine Learning projects in C++? This list will help you:

#	Project	Stars
1	tensorflow	182,323
2	tesseract-ocr	57,866
3	Caffe	33,859
4	openpose	29,802
5	C-Plus-Plus	29,048
6	xgboost	25,548
7	mediapipe	25,405
8	DeepSpeech	24,212
9	PaddlePaddle	21,584
10	CNTK	17,435
11	LightGBM	16,043
12	Dlib	13,011
13	onnxruntime	12,656
14	Open3D	10,485
15	vowpal_wabbit	8,400
16	MNN	8,293
17	jetson-inference	7,323
18	serving	6,070
19	interpret	5,988
20	tiny-cnn	5,763
21	oneflow	5,715
22	pixie	5,273
23	flashlight	5,145