C++ Inference

Open-source C++ projects categorized as Inference

Top 13 C++ Inference Projects

  1. whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    Project mention: Show HN: OWhisper – Ollama for realtime speech-to-text | news.ycombinator.com | 2025-08-14

    Thank you for taking the time to build something and share it. However, what is the advantage of using this over whisper.cpp's stream example, which can also do real-time transcription?

    https://github.com/ggml-org/whisper.cpp/tree/master/examples...
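
    For a sense of the underlying API that the stream example wraps, here is a minimal offline-transcription sketch against the whisper.h C interface (the model path is a placeholder; function names per recent whisper.cpp releases):

      #include "whisper.h"
      #include <cstdio>
      #include <vector>

      int main() {
          // Load a ggml model file (placeholder path).
          struct whisper_context_params cparams = whisper_context_default_params();
          struct whisper_context * ctx =
              whisper_init_from_file_with_params("ggml-base.en.bin", cparams);
          if (!ctx) return 1;

          // whisper_full() expects mono 16 kHz float PCM in [-1, 1];
          // one second of silence stands in for real audio here.
          std::vector<float> pcm(16000, 0.0f);

          whisper_full_params wparams = whisper_full_default_params(WHISPER_SAMPLING_GREEDY);
          if (whisper_full(ctx, wparams, pcm.data(), (int) pcm.size()) != 0) return 1;

          // Print the decoded segments.
          for (int i = 0; i < whisper_full_n_segments(ctx); ++i) {
              printf("%s\n", whisper_full_get_segment_text(ctx, i));
          }
          whisper_free(ctx);
      }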

  2. mediapipe

    Cross-platform, customizable ML solutions for live and streaming media.

    Project mention: Google AI Edge – on-device cross-platform AI deployment | news.ycombinator.com | 2025-06-01

    This isn't really true. They are different offerings.

    CoreML is specific to the Apple ecosystem and lets you convert a PyTorch model to a CoreML .mlmodel that will run with acceleration on iOS/Mac.

    Google Mediapipe is a giant C++ library for running ML flows on any device (iOS/Android/Web). It includes TensorFlow Lite (now LiteRT) but is also a graph processor that helps with common ML preprocessing tasks like image resizing, annotating, etc.

    Google killing products early is a good meme, but Mediapipe is open source, so you can at least credit them with that. https://github.com/google-ai-edge/mediapipe

    I used a fork of Mediapipe for a contract iOS/Android computer vision product and it was very complex but worked well. A cross-platform solution would not have been possible with CoreML.
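
    As a flavor of that graph model, here is a condensed sketch of MediaPipe's hello-world pass-through graph (macro and header names shift between releases, so treat this as an approximation):

      #include <string>

      #include "mediapipe/framework/calculator_graph.h"
      #include "mediapipe/framework/port/logging.h"
      #include "mediapipe/framework/port/parse_text_proto.h"

      absl::Status RunPassThrough() {
          // One-node graph: packets sent to "in" are copied to "out".
          auto config = mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(R"pb(
              input_stream: "in"
              output_stream: "out"
              node { calculator: "PassThroughCalculator" input_stream: "in" output_stream: "out" }
          )pb");

          mediapipe::CalculatorGraph graph;
          MP_RETURN_IF_ERROR(graph.Initialize(config));
          MP_ASSIGN_OR_RETURN(mediapipe::OutputStreamPoller poller,
                              graph.AddOutputStreamPoller("out"));
          MP_RETURN_IF_ERROR(graph.StartRun({}));

          MP_RETURN_IF_ERROR(graph.AddPacketToInputStream(
              "in", mediapipe::MakePacket<std::string>("hello").At(mediapipe::Timestamp(0))));
          MP_RETURN_IF_ERROR(graph.CloseInputStream("in"));

          // Drain the output stream, then wait for the graph to finish.
          mediapipe::Packet packet;
          while (poller.Next(&packet)) LOG(INFO) << packet.Get<std::string>();
          return graph.WaitUntilDone();
      }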

  3. ncnn

    ncnn is a high-performance neural network inference framework optimized for mobile platforms

    Project mention: OpenMP 6.0 | news.ycombinator.com | 2024-11-14
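
    (ncnn parallelizes its CPU kernels with OpenMP, which is presumably why it surfaced in that thread.) The core API is compact; a minimal sketch with placeholder file and blob names:

      #include "net.h"   // ncnn

      int main() {
          ncnn::Net net;
          // ncnn models ship as a .param (structure) / .bin (weights) pair.
          if (net.load_param("model.param") != 0) return 1;
          if (net.load_model("model.bin") != 0) return 1;

          // Dummy 224x224 RGB input; the ncnn::Mat::from_pixels* helpers
          // handle real image conversion and resizing.
          ncnn::Mat in(224, 224, 3);
          in.fill(0.5f);

          ncnn::Extractor ex = net.create_extractor();
          ex.input("data", in);       // input blob name depends on the model
          ncnn::Mat out;
          ex.extract("output", out);  // output blob name likewise
      }
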
  4. TensorRT

    NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

    Project mention: Generative AI Interview for Senior Data Scientists: 50 Key Questions and Answers | dev.to | 2025-05-06

    What is the purpose of using ONNX or TensorRT for deployment? When deploying a trained deep learning model into a real-world service environment, optimizing it to increase execution speed and reduce resource consumption is crucial. ONNX and TensorRT are prominent tools widely used for this purpose.
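
    To make the TensorRT half of that answer concrete, a rough sketch of the offline build step that turns an ONNX file into a serialized engine (TensorRT 8.x-style API; paths are placeholders; error handling and cleanup elided):

      #include "NvInfer.h"
      #include "NvOnnxParser.h"
      #include <fstream>
      #include <iostream>

      // Minimal ILogger implementation the builder requires.
      class Logger : public nvinfer1::ILogger {
          void log(Severity severity, const char* msg) noexcept override {
              if (severity <= Severity::kWARNING) std::cout << msg << "\n";
          }
      } gLogger;

      int main() {
          auto builder = nvinfer1::createInferBuilder(gLogger);
          auto network = builder->createNetworkV2(
              1U << (int) nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
          auto parser = nvonnxparser::createParser(*network, gLogger);
          parser->parseFromFile("model.onnx",
                                (int) nvinfer1::ILogger::Severity::kWARNING);

          // Build a device-optimized engine and serialize it for later reuse.
          auto config = builder->createBuilderConfig();
          auto plan = builder->buildSerializedNetwork(*network, *config);
          std::ofstream("model.engine", std::ios::binary)
              .write((const char*) plan->data(), plan->size());
      }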

  5. openvino

    OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

    Project mention: Court is in session: Top 10 most notorious C and C++ errors in 2024 | dev.to | 2024-12-28

    V766 An item with the same key '"SoftPlus"' has already been added. cpu_types.cpp 198
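
    That analyzer finding aside, the toolkit's current C++ API (the 2.0 ov:: interface) is quite small; a minimal CPU-inference sketch with a placeholder model path:

      #include <openvino/openvino.hpp>
      #include <algorithm>

      int main() {
          ov::Core core;
          // Read an IR (.xml/.bin) or ONNX model and compile it for a device.
          auto model = core.read_model("model.xml");
          ov::CompiledModel compiled = core.compile_model(model, "CPU");

          ov::InferRequest request = compiled.create_infer_request();
          // Zero-fill the input tensor; real code would copy actual data in.
          ov::Tensor input = request.get_input_tensor();
          std::fill_n(input.data<float>(), input.get_size(), 0.0f);

          request.infer();
          ov::Tensor output = request.get_output_tensor();  // read results here
      }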

  6. jetson-inference

    Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
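
    A hedged sketch of the classification API from the Hello AI World examples (class and helper names per the project's docs; the image path and network name are placeholders):

      #include <jetson-inference/imageNet.h>
      #include <jetson-utils/loadImage.h>
      #include <cstdio>

      int main() {
          // Load an image into shared CPU/GPU memory (jetson-utils helper).
          uchar3* image = NULL;
          int width = 0, height = 0;
          if (!loadImage("example.jpg", &image, &width, &height)) return 1;

          // Create a TensorRT-accelerated classification network.
          imageNet* net = imageNet::Create("googlenet");
          if (!net) return 1;

          float confidence = 0.0f;
          const int cls = net->Classify(image, width, height, &confidence);
          if (cls >= 0)
              printf("class %d (%s), confidence %.3f\n",
                     cls, net->GetClassDesc(cls), confidence);

          delete net;
      }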

  7. TNN

    TNN: a uniform deep learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and draws on the good extensibility and high performance of existing open source efforts.
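
    A rough sketch of TNN's entry points as used in its demos (header paths and enum values hedged from the public API; file paths are placeholders):

      #include <tnn/core/tnn.h>
      #include <fstream>
      #include <sstream>
      #include <string>

      // TNN takes model files by content, so slurp each file into a string.
      static std::string ReadFile(const char* path) {
          std::ifstream f(path, std::ios::binary);
          std::ostringstream ss;
          ss << f.rdbuf();
          return ss.str();
      }

      int main() {
          TNN_NS::ModelConfig model_config;
          model_config.model_type = TNN_NS::MODEL_TYPE_TNN;
          // .tnnproto describes the graph; .tnnmodel holds the weights.
          model_config.params = {ReadFile("model.tnnproto"), ReadFile("model.tnnmodel")};

          TNN_NS::TNN tnn;
          if (tnn.Init(model_config) != TNN_NS::TNN_OK) return 1;

          TNN_NS::NetworkConfig network_config;
          network_config.device_type = TNN_NS::DEVICE_ARM;  // or DEVICE_X86, DEVICE_CUDA, ...

          TNN_NS::Status status;
          auto instance = tnn.CreateInst(network_config, status);
          if (status != TNN_NS::TNN_OK || !instance) return 1;

          // Real code sets inputs via instance->SetInputMat(...) first.
          instance->Forward();
      }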

  8. CTranslate2

    Fast inference engine for Transformer models

    Project mention: Brood War Korean Translations | news.ycombinator.com | 2025-01-17

    Thanks for the added context on the builds! As a "foreign" BW player and fellow speech processing researcher, I agree that shallow contextual biasing should help. While it's not difficult to implement, most generally available ASR solutions don't make it easy to use. There's a PR in ctranslate2 implementing the same feature so that it can be exposed in faster-whisper: https://github.com/OpenNMT/CTranslate2/pull/1789
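
    For reference, the C++ interface mirrors the project's README example closely; a short batch-translation sketch (the model directory is a placeholder, and input must already be tokenized):

      #include <ctranslate2/translator.h>
      #include <iostream>
      #include <string>
      #include <vector>

      int main() {
          // Directory produced by the ct2 model converter (placeholder name).
          ctranslate2::Translator translator("ende_ctranslate2", ctranslate2::Device::CPU);

          // One batch entry, pre-tokenized (SentencePiece-style tokens here).
          const std::vector<std::vector<std::string>> batch = {{"▁Hello", "▁world", "!"}};

          ctranslate2::TranslationOptions options;
          options.beam_size = 2;

          const auto results = translator.translate_batch(batch, options);
          for (const auto& token : results[0].output())
              std::cout << token << ' ';
          std::cout << '\n';
      }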

  9. bark.cpp

    Suno AI's Bark model in C/C++ for fast text-to-speech generation

  10. cppflow

    Run TensorFlow models in C++ without installation and without Bazel
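
    The API is deliberately tiny; this sketch follows the project's README (paths are placeholders; only the TensorFlow C library is needed at build time):

      #include <cppflow/cppflow.h>
      #include <iostream>

      int main() {
          // Decode a JPEG, cast to float, and add a batch dimension.
          auto input = cppflow::decode_jpeg(cppflow::read_file(std::string("image.jpg")));
          input = cppflow::cast(input, TF_UINT8, TF_FLOAT);
          input = cppflow::expand_dims(input, 0);

          // Run a TensorFlow SavedModel without touching Bazel.
          cppflow::model model("saved_model_dir");
          auto output = model(input);

          std::cout << "predicted class: " << cppflow::arg_max(output, 1) << std::endl;
      }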

  11. tensorrt-cpp-api

    TensorRT C++ API Tutorial

  12. onnxruntime_backend

    The Triton backend for the ONNX Runtime.

    Project mention: Zero-Shot Text Classification on a low-end CPU-only machine? | news.ycombinator.com | 2024-10-07

    Hah, it actually gets worse. What I was describing was the Triton ONNX backend with the OpenVINO execution accelerator[0] (not the OpenVINO backend itself). Clear as mud, right?

    Your issue here is model performance, with the additional challenge of serving it over a network socket across multiple requests in a performant manner.

    Triton does things like dynamic batching[1], where throughput is increased significantly by aggregating disparate requests into one pass through the GPU.

    A Docker container for Torch, ONNX, OpenVINO, etc. isn't even going to offer a network socket natively. This is where people try rolling their own FastAPI implementation (or something), only to discover it completely falls apart under any kind of load. That's development effort as well, but it's a waste of time.

    [0] - https://github.com/triton-inference-server/onnxruntime_backe...

    [1] - https://docs.nvidia.com/deeplearning/triton-inference-server...
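
    To make the dynamic-batching point concrete: in Triton this is configured per model in config.pbtxt rather than in code. A hedged sketch combining the batching and OpenVINO-accelerator settings discussed above (field names follow Triton's model-configuration docs; values are illustrative):

      name: "my_onnx_model"
      backend: "onnxruntime"
      max_batch_size: 16

      # Hold individual requests briefly so Triton can merge them into one batch.
      dynamic_batching {
        preferred_batch_size: [ 4, 8, 16 ]
        max_queue_delay_microseconds: 100
      }

      # Route CPU execution through the OpenVINO accelerator ([0] above).
      optimization { execution_accelerators {
        cpu_execution_accelerator : [ { name : "openvino" } ]
      }}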

  13. EasyOCR-cpp

    Custom C++ implementation of deep-learning-based OCR

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

C++ Inference related posts

  • Whispercpp – Local, Fast, and Private Audio Transcription for Ruby

    1 project | news.ycombinator.com | 7 Jun 2025
  • Build Your Own Siri. Locally. On-Device. No Cloud

    1 project | news.ycombinator.com | 13 May 2025
  • Whisper.cpp: Looking for Maintainers

    1 project | news.ycombinator.com | 4 Feb 2025
  • Court is in session: Top 10 most notorious C and C++ errors in 2024

    2 projects | dev.to | 28 Dec 2024
  • OpenMP 6.0

    2 projects | news.ycombinator.com | 14 Nov 2024
  • OpenVINO's AI Success: Brilliance or Cracks Beneath the Surface?

    1 project | dev.to | 18 Sep 2024
  • 12 moments of typos and copy-paste, or why AI hallucinates: checking OpenVINO

    1 project | dev.to | 26 Jun 2024

Index

What are some of the best open-source Inference projects in C++? This list will help you:

 #  Project              Stars
 1  whisper.cpp          42,817
 2  mediapipe            31,139
 3  ncnn                 21,982
 4  TensorRT             12,079
 5  openvino              8,764
 6  jetson-inference      8,472
 7  TNN                   4,572
 8  CTranslate2           3,992
 9  bark.cpp                835
10  cppflow                 802
11  tensorrt-cpp-api        757
12  onnxruntime_backend     159
13  EasyOCR-cpp              59
