C++ Machine Learning

Open-source C++ projects categorized as Machine Learning

Top 23 C++ Machine Learning Projects

Machine Learning
  1. tensorflow

    An Open Source Machine Learning Framework for Everyone

    Project mention: None of the top 10 projects in GitHub is actually a software project 🤯 | dev.to | 2025-05-10

    We see an addition to the AI community with AutoGPT. Along with Tensorflow they represent the AI community in the software category, which is getting relevant (2 out of 8). We can expect in the future to have new AI projects in the top 25 such as Transformers or Ollama (currently top 34 and 36, respectively).

  3. tesseract-ocr

    Tesseract Open Source OCR Engine (main repository)

    Project mention: Mistral OCR | news.ycombinator.com | 2025-03-06

    https://www.home-assistant.io/integrations/seven_segments/

    https://www.unix-ag.uni-kl.de/~auerswal/ssocr/

    https://github.com/tesseract-ocr/tesseract

    https://community.home-assistant.io/t/ocr-on-camera-image-fo...

    https://www.google.com/search?q=home+assistant+ocr+integrati...

    https://www.google.com/search?q=esphome+ocr+sensor

    https://hackaday.com/2021/02/07/an-esp-will-read-your-meter-...

    ...start digging around and you'll likely find something. HA has integrations which can support writing to InfluxDB (local for sure, and you can probably configure it for a remote influxdb).

    You're looking at 1xRaspberry PI, 1xUSB Webcam, 1x"Power Management / humidity management / waterproof electrical box" to stuff it into, and then either YOLO and DIY to shoot over to your influxdb, or set up a Home Assistant and "attach" your frankenbox as some sort of "sensor" or "integration" which spits out metrics and yadayada...

  4. Caffe

    Caffe: a fast open framework for deep learning.

  5. openpose

    OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

  6. C-Plus-Plus

    Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.

  7. mediapipe

    Cross-platform, customizable ML solutions for live and streaming media.

    Project mention: Integrating MediaPipe with DeepSeek for Enhanced AI Performance | dev.to | 2025-02-03

    Code Examples: Check out the MediaPipe and LLM Integration Examples provided by Google AI Edge.

  8. xgboost

    Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

    Project mention: What AI/ML Models Should You Use and Why? | dev.to | 2024-10-29

    Boosting: Boosting is not a separate ML model but a technique that combines multiple weak learners into a single model that can generate highly accurate predictions. XGBoost is a common boosting library that supports distributed training, resulting in faster training. According to research by Intel, XGBoost can be more effective than a neural network-based approach for tabular data. In addition, XGBoost is faster to train and doesn't require as much data as neural networks need.
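
    To make the "combine many weak learners" idea concrete, here is a minimal C++ sketch of boosting for one-dimensional regression. It is not XGBoost's implementation (which uses deeper trees, regularization, and second-order gradient information); the names `Stump`, `fit_stump`, `fit_boosted`, and `mean_squared_error` are hypothetical and exist only for this illustration. Each round fits a one-split tree to the current residuals and adds a scaled copy of it to the model.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// A decision stump (one-split tree): the "weak learner" in this sketch.
struct Stump {
    double threshold;
    double left;   // prediction when x <= threshold
    double right;  // prediction when x >  threshold
    double predict(double x) const { return x <= threshold ? left : right; }
};

// Fit a stump to the residuals: try each sample's x as the threshold and
// keep the split whose per-side means give the lowest squared error.
Stump fit_stump(const std::vector<double>& xs, const std::vector<double>& residuals) {
    assert(!xs.empty() && xs.size() == residuals.size());
    Stump best{xs[0], 0.0, 0.0};
    double best_err = std::numeric_limits<double>::infinity();
    for (double t : xs) {
        double sum_l = 0.0, sum_r = 0.0;
        std::size_t n_l = 0, n_r = 0;
        for (std::size_t i = 0; i < xs.size(); ++i) {
            if (xs[i] <= t) { sum_l += residuals[i]; ++n_l; }
            else            { sum_r += residuals[i]; ++n_r; }
        }
        const double mean_l = n_l ? sum_l / static_cast<double>(n_l) : 0.0;
        const double mean_r = n_r ? sum_r / static_cast<double>(n_r) : 0.0;
        double err = 0.0;
        for (std::size_t i = 0; i < xs.size(); ++i) {
            const double d = residuals[i] - (xs[i] <= t ? mean_l : mean_r);
            err += d * d;
        }
        if (err < best_err) { best_err = err; best = Stump{t, mean_l, mean_r}; }
    }
    return best;
}

// The boosted model is the scaled sum of all stumps fitted so far.
struct BoostedModel {
    double learning_rate;
    std::vector<Stump> stumps;
    double predict(double x) const {
        double y = 0.0;
        for (const Stump& s : stumps) y += learning_rate * s.predict(x);
        return y;
    }
};

// Each round fits a stump to what the model still gets wrong (the residuals).
BoostedModel fit_boosted(const std::vector<double>& xs, const std::vector<double>& ys,
                         int rounds, double learning_rate) {
    BoostedModel model{learning_rate, {}};
    std::vector<double> residuals = ys;  // the model starts by predicting 0
    for (int r = 0; r < rounds; ++r) {
        const Stump s = fit_stump(xs, residuals);
        model.stumps.push_back(s);
        for (std::size_t i = 0; i < xs.size(); ++i)
            residuals[i] -= learning_rate * s.predict(xs[i]);
    }
    return model;
}

double mean_squared_error(const BoostedModel& m, const std::vector<double>& xs,
                          const std::vector<double>& ys) {
    double sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        const double d = ys[i] - m.predict(xs[i]);
        sum += d * d;
    }
    return sum / static_cast<double>(xs.size());
}
```

    A single stump cannot represent a target with more than two levels, but calling `fit_boosted` with more rounds lets the sum of stumps approximate it increasingly well on the training data.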

  10. DeepSpeech

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

    Project mention: From Voice to Text: Exploring Speech-to-Text Tools and APIs for Developers | dev.to | 2025-05-19

    Setup: Install deepspeech with pip install deepspeech. Download pre-trained models from DeepSpeech Releases. Use a 16kHz mono WAV file.
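
    Since the setup above calls for a 16 kHz mono WAV file, a quick pre-flight check can catch mismatched audio before it reaches the engine. The C++ sketch below is independent of DeepSpeech's own API. It assumes the canonical WAV layout in which the `fmt ` chunk immediately follows the `RIFF`/`WAVE` header (real files may carry extra chunks first), and the names `WavFormat`, `parse_wav_header`, and `is_16khz_mono` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Fields of interest from the WAV "fmt " chunk.
struct WavFormat {
    std::uint16_t audio_format;     // 1 means uncompressed PCM
    std::uint16_t channels;
    std::uint32_t sample_rate;      // in Hz
    std::uint16_t bits_per_sample;
};

// WAV stores integers in little-endian byte order.
static std::uint16_t read_u16_le(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t read_u32_le(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Parse the start of a WAV file held in memory.
// Assumption: canonical layout, with "fmt " at byte offset 12.
// Returns false if the buffer is too short or the tags do not match.
bool parse_wav_header(const std::vector<std::uint8_t>& bytes, WavFormat& out) {
    if (bytes.size() < 36) return false;
    const std::uint8_t* p = bytes.data();
    if (std::memcmp(p, "RIFF", 4) != 0) return false;
    if (std::memcmp(p + 8, "WAVE", 4) != 0) return false;
    if (std::memcmp(p + 12, "fmt ", 4) != 0) return false;
    out.audio_format    = read_u16_le(p + 20);
    out.channels        = read_u16_le(p + 22);
    out.sample_rate     = read_u32_le(p + 24);
    out.bits_per_sample = read_u16_le(p + 34);
    return true;
}

// The two properties named in the setup note: 16 kHz sample rate, one channel.
bool is_16khz_mono(const WavFormat& f) {
    return f.sample_rate == 16000 && f.channels == 1;
}
```

    In practice the first few dozen bytes of the file would be read into the buffer, and any audio that fails the check would be resampled or downmixed with a separate tool first.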

  11. PaddlePaddle

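    PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)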

    Project mention: GPT 4.5 level for 1% of the price | news.ycombinator.com | 2025-03-16

    PaddlePaddle (so good they named it twice) predates Ray and supports both data parallel and model-parallel training. It is still being developed.

    https://github.com/PaddlePaddle/Paddle

    They have pedigree.

  12. CNTK

    Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

    Project mention: Top 8 AI Open Source Software Libraries | dev.to | 2024-07-24

    GitHub source code: CNTK

  13. LightGBM

    A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

  14. onnxruntime

    ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

    Project mention: Generative AI Interview for Senior Data Scientists: 50 Key Questions and Answers | dev.to | 2025-05-06

    How it works: A model trained in one framework can be converted to the ONNX format. This format can then be run on various hardware or inference engines that support it (e.g., ONNX Runtime). It facilitates easy model transfer and execution even if the development framework and deployment environment differ.

  15. Dlib

    A toolkit for making real world machine learning and data analysis applications in C++

    Project mention: Dlib: Modern C++ toolkit containing machine learning algorithms | news.ycombinator.com | 2025-03-19
  16. video2x

    A machine learning-based video super resolution and frame interpolation framework. Est. Hack the Valley II, 2018.

  17. ggml

    Tensor library for machine learning

    Project mention: Xiaomi unveils open-source AI reasoning model MiMo | news.ycombinator.com | 2025-04-30

    One of the core design goals Georgi Gerganov had with GGUF was to not need other files. It's literally bullet point #1 in the specs

    >Single-file deployment

    >Full information: all information needed to load a model is contained in the model file, and no additional information needs to be provided by the user.

    https://github.com/ggml-org/ggml/blob/master/docs/gguf.md

    We literally just got rid of that multi file chaos only for ollama to add it back :/
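
    As a small illustration of the single-file idea, the C++ sketch below reads the fixed-size header at the start of a GGUF file. It assumes the layout as I understand it from the linked spec (a 4-byte `GGUF` magic, a 32-bit version, then 64-bit tensor and metadata key-value counts, all little-endian), and this should be checked against the spec before relying on it. The names `GgufHeader` and `parse_gguf_header` are hypothetical and not part of ggml's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Fixed-size fields at the start of a GGUF file (assumed layout, see above).
struct GgufHeader {
    std::uint32_t version;
    std::uint64_t tensor_count;
    std::uint64_t metadata_kv_count;
};

// Read an n-byte little-endian unsigned integer (n <= 8).
static std::uint64_t read_le(const std::uint8_t* p, std::size_t n) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
        v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    return v;
}

// Parse the header from the first bytes of a file held in memory.
// Returns false if the buffer is too short or the magic does not match.
bool parse_gguf_header(const std::vector<std::uint8_t>& bytes, GgufHeader& out) {
    const std::size_t header_size = 4 + 4 + 8 + 8;  // magic, version, two counts
    if (bytes.size() < header_size) return false;
    const std::uint8_t* p = bytes.data();
    if (std::memcmp(p, "GGUF", 4) != 0) return false;
    out.version           = static_cast<std::uint32_t>(read_le(p + 4, 4));
    out.tensor_count      = read_le(p + 8, 8);
    out.metadata_kv_count = read_le(p + 16, 8);
    return true;
}
```

    The metadata key-value pairs and tensor descriptions follow this header in the same file, which is what lets a loader work from the model file alone.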

  18. Open3D

    Open3D: A Modern Library for 3D Data Processing

  19. MNN

    MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android app: MNN-LLM-Android (./apps/Android/MnnLlmChat/README.md in the repository).

    Project mention: Alibaba mnn android app support DeepSeek R1 model | news.ycombinator.com | 2025-02-06
  20. vowpal_wabbit

    Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

  21. oneflow

    OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

  22. catboost

    A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

    Project mention: 🚀 Why Your ML Service Needs Rust + CatBoost: A Setup Guide That Actually Works | dev.to | 2025-01-19

    [package]
    name = "MLApp"
    version = "0.1.0"
    edition = "2021"

    [dependencies]
    catboost = { git = "https://github.com/catboost/catboost", rev = "0bfdc35"}

  23. jetson-inference

    Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

  24. interpret

    Fit interpretable models. Explain blackbox machine learning.

  25. serving

    A flexible, high-performance serving system for machine learning models

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

C++ Machine Learning related posts

  • Baby Steps into Genetic Programming

    1 project | news.ycombinator.com | 7 Apr 2025
  • How to Create Vector Embeddings in Node.js

    1 project | dev.to | 3 Apr 2025
  • Dlib: Modern C++ toolkit containing machine learning algorithms

    1 project | news.ycombinator.com | 19 Mar 2025
  • GPT 4.5 level for 1% of the price

    1 project | news.ycombinator.com | 16 Mar 2025
  • Show HN: Txeo – A Modern C++ Wrapper for TensorFlow

    3 projects | news.ycombinator.com | 21 Feb 2025
  • Train a Mnist VAE with C and CUDA

    1 project | news.ycombinator.com | 21 Dec 2024
  • Unlocking DuckDB from Anywhere - A Guide to Remote Access with Apache Arrow and Flight RPC (gRPC)

    4 projects | dev.to | 12 Dec 2024

Index

What are some of the best open-source Machine Learning projects in C++? This list will help you:

#   Project           Stars
1   tensorflow        189,943
2   tesseract-ocr     66,835
3   Caffe             34,356
4   openpose          32,454
5   C-Plus-Plus       31,775
6   mediapipe         29,728
7   xgboost           26,922
8   DeepSpeech        26,336
9   PaddlePaddle      22,774
10  CNTK              17,552
11  LightGBM          17,217
12  onnxruntime       16,629
13  Dlib              13,994
14  video2x           13,351
15  ggml              12,521
16  Open3D            12,300
17  MNN               10,862
18  vowpal_wabbit     8,565
19  oneflow           8,448
20  catboost          8,394
21  jetson-inference  8,287
22  interpret         6,496
23  serving           6,281
