C++ Machine Learning

Open-source C++ projects categorized as Machine Learning

Top 23 C++ Machine Learning Projects

  • tensorflow

    An Open Source Machine Learning Framework for Everyone

    Project mention: TensorFlow-metal on Apple Mac is junk for training | news.ycombinator.com | 2024-01-16

  • tesseract-ocr

    Tesseract Open Source OCR Engine (main repository)

    Project mention: one of the Codia AI Design technologies: OCR Technology | dev.to | 2024-02-14

    You will also need to install the Tesseract OCR engine, which can be downloaded and installed from the following link: https://github.com/tesseract-ocr/tesseract

  • Caffe

    Caffe: a fast open framework for deep learning.

    Project mention: List of AI-Models | /r/GPT_do_dah | 2023-05-16

  • openpose

    OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

    Project mention: AI "Artists" Are Lazy, and the Ultimate Goal of AI Image Generation (hint: its sloth) | /r/ArtistHate | 2023-11-25

    Open Pose, a multi-person keypoint detection library for body, face, hands, and foot estimation [10], is used for posing generated characters;

  • C-Plus-Plus

    Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.

  • xgboost

    Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

    Project mention: XGBoost 2.0 | news.ycombinator.com | 2023-10-13

  • mediapipe

    Cross-platform, customizable ML solutions for live and streaming media.

    Project mention: Mediapipe openpose Controlnet model for SD | /r/localdiffusion | 2023-11-15

    mediapipe/docs/solutions/pose.md at master · google/mediapipe · GitHub

  • DeepSpeech

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

    Project mention: Common Voice | news.ycombinator.com | 2023-12-05

  • PaddlePaddle

    PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)

    Project mention: List of AI-Models | /r/GPT_do_dah | 2023-05-16

  • CNTK

    Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

  • LightGBM

    A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

    Project mention: SIRUS.jl: Interpretable Machine Learning via Rule Extraction | /r/Julia | 2023-06-29

    SIRUS.jl is a pure Julia implementation of the SIRUS algorithm by Bénard et al. (2021). The algorithm is a rule-based machine learning model, meaning that it is fully interpretable. It works by first fitting a random forest and then converting this forest to rules. Furthermore, the algorithm is stable and achieves a predictive performance that is comparable to LightGBM, a state-of-the-art gradient boosting model created by Microsoft. Interpretability, stability, and predictive performance are described in more detail below.

  • Dlib

    A toolkit for making real world machine learning and data analysis applications in C++

    Project mention: Modern Image Processing Algorithms Implementation in C | news.ycombinator.com | 2023-06-06

  • onnxruntime

    ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

    Project mention: FLaNK Stack 05 Feb 2024 | dev.to | 2024-02-05

  • Open3D

    Open3D: A Modern Library for 3D Data Processing

    Project mention: Does anyone else agree that the links to the latest development version of Open3D don't work? | /r/cscareerquestions | 2023-07-10

    I was going to file a bug about another issue, but I have to download the development version. This is why I want this solved quickly. None of the links seem to work: https://github.com/isl-org/Open3D/issues/6259

  • vowpal_wabbit

    Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

    Project mention: Data Science terminology can be wild | /r/datascience | 2023-03-16

    Let me introduce you to my friend Vowpal Wabbit. https://vowpalwabbit.org/

  • MNN

    MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

    Project mention: [D][R] Deploying deep models on memory constrained devices | /r/MachineLearning | 2023-10-03

    However, I am looking at this subject through the problem of training/fine-tuning deep models on edge devices, which is becoming increasingly feasible. I am looking at tflite, Alibaba's MNN, mit-han-lab's tinyengine, etc.

  • jetson-inference

    Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

    Project mention: Can this NVIDIA Jetson Nano handle advanced machine learning tasks? | /r/NvidiaJetson | 2023-03-18

    Jetson Nanos are obsolete and no longer supported, but to answer your question, this might be a good place to start.

  • serving

    A flexible, high-performance serving system for machine learning models

    Project mention: Llama.cpp: Full CUDA GPU Acceleration | news.ycombinator.com | 2023-06-12

    Yet another TEDIOUS BATTLE: Python vs. C++/C stack.

    This project gained popularity due to the HIGH DEMAND for running large models with 1B+ parameters, like `llama`. Python dominates the interface and training ecosystem, but prior to llama.cpp, non-ML professionals showed little interest in a fast C++ interface library. While existing solutions like tensorflow-serving [1] in C++ were sufficiently fast with GPU support, llama.cpp took the initiative to optimize for CPU and trim unnecessary code, essentially code-golfing and sacrificing some algorithm correctness for improved performance, which isn't favored by "ML research".

    NOTE: In my opinion, a true pioneer was DarkNet, which implemented the YOLO model series and significantly outperformed others [2]. Same trick basically like llama.cpp

    [1] https://github.com/tensorflow/serving

  • interpret

    Fit interpretable models. Explain blackbox machine learning.

  • tiny-cnn

    header only, dependency-free deep learning framework in C++14

  • oneflow

    OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

  • pixie

    Instant Kubernetes-Native Application Observability

    Project mention: Grafana Beyla: OSS eBPF auto-instrumentation for application observability | news.ycombinator.com | 2023-09-13

  • flashlight

    A C++ standalone library for machine learning (by flashlight)

    Project mention: MatX: Efficient C++17 GPU numerical computing library with Python-like syntax | news.ycombinator.com | 2023-10-03

    I think a comparison to PyTorch, TensorFlow and/or JAX is more relevant than a comparison to CuPy/NumPy.

    And then maybe also a comparison to Flashlight (https://github.com/flashlight/flashlight) or other C/C++ based ML/computing libraries?

    Also, there is no mention of it, so I suppose this does not support automatic differentiation?

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-02-14.

What are some of the best open-source Machine Learning projects in C++? This list will help you:

 #  Project           Stars
 1  tensorflow        181,050
 2  tesseract-ocr      56,798
 3  Caffe              33,783
 4  openpose           29,401
 5  C-Plus-Plus        28,674
 6  xgboost            25,353
 7  mediapipe          24,936
 8  DeepSpeech         23,929
 9  PaddlePaddle       21,402
10  CNTK               17,434
11  LightGBM           15,890
12  Dlib               12,676
13  onnxruntime        12,145
14  Open3D             10,198
15  vowpal_wabbit       8,386
16  MNN                 8,187
17  jetson-inference    7,169
18  serving             6,059
19  interpret           5,925
20  tiny-cnn            5,738
21  oneflow             5,620
22  pixie               5,171
23  flashlight          5,090