C++ Inference

Open-source C++ projects categorized as Inference

Top 15 C++ Inference Projects

  1. whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    Project mention: Building a personal, private AI computer on a budget | news.ycombinator.com | 2025-02-11

    A great thread with the type of info you're looking for lives here: https://github.com/ggerganov/whisper.cpp/issues/89

    But you can likely find similar threads for the llama.cpp benchmark here: https://github.com/ggerganov/llama.cpp/tree/master/examples/...

    These are good examples because the llama.cpp and whisper.cpp benchmarks take full advantage of Apple hardware, but also of non-Apple hardware with GPU support, AVX support, etc.

    It's been true for a while now that the memory bandwidth of modern Apple systems, in tandem with the Neural Engine and GPU, has made them very competitive with Nvidia for local inference and even training.
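    The memory-bandwidth point can be made concrete: single-stream token generation is memory-bound, since each generated token streams roughly the whole model through memory once, so bandwidth divided by model size gives an upper bound on tokens per second. The numbers below are illustrative, not measured:

    ```cpp
    #include <cstdio>

    int main() {
        // Rough upper bound for memory-bound autoregressive decoding:
        // every generated token reads (approximately) the whole model
        // from memory once, so bandwidth / model size caps tokens/sec.
        const double bandwidth_gb_s = 800.0;  // M2 Ultra-class part (illustrative)
        const double model_gb = 3.5;          // ~7B parameters at 4-bit quantization

        const double max_tokens_per_s = bandwidth_gb_s / model_gb;
        std::printf("upper bound: ~%.0f tokens/s\n", max_tokens_per_s);
        return 0;
    }
    ```

    Real throughput lands well below this bound, but the calculation explains why high-bandwidth unified memory matters more than raw compute for local inference.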

  2. mediapipe

    Cross-platform, customizable ML solutions for live and streaming media.

    Project mention: Integrating MediaPipe with DeepSeek for Enhanced AI Performance | dev.to | 2025-02-03

    Code Examples: Check out the MediaPipe and LLM Integration Examples provided by Google AI Edge.

  3. ncnn

    ncnn is a high-performance neural network inference framework optimized for the mobile platform

    Project mention: OpenMP 6.0 | news.ycombinator.com | 2024-11-14
  4. TensorRT

    NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

    Project mention: The 6 Best LLM Tools To Run Models Locally | dev.to | 2024-08-29

    Extensions: Jan supports extensions like TensorRT and Inference Nitro for customizing and enhancing your AI models.

  5. jetson-inference

    Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

  6. openvino

    OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

    Project mention: Court is in session: Top 10 most notorious C and C++ errors in 2024 | dev.to | 2024-12-28

    V766 An item with the same key '"SoftPlus"' has already been added. cpu_types.cpp 198
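    The bug class behind the V766 diagnostic is easy to reproduce: a duplicate key in a map initializer list is silently dropped, because std::map keeps only the first occurrence. The names below are illustrative, not OpenVINO's actual cpu_types.cpp table:

    ```cpp
    #include <cstdio>
    #include <map>
    #include <string>

    int main() {
        // Duplicate key in an initializer list: std::map keeps the first
        // occurrence and silently discards the second, so the "2" entry
        // below never takes effect -- exactly what V766 flags.
        const std::map<std::string, int> precision_of = {
            {"SoftPlus", 1},
            {"SoftPlus", 2},  // silently ignored
        };
        std::printf("entries: %zu, SoftPlus -> %d\n",
                    precision_of.size(), precision_of.at("SoftPlus"));
        return 0;
    }
    ```

    Because no compiler warning fires for this, static analysis is typically the only way such copy-paste duplicates get caught.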

  7. TNN

    TNN: a uniform deep learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and draws on the extensibility and high performance of existing open-source efforts.

  8. CTranslate2

    Fast inference engine for Transformer models

    Project mention: Brood War Korean Translations | news.ycombinator.com | 2025-01-17

    Thanks for the added context on the builds! As a "foreign" BW player and fellow speech processing researcher, I agree shallow contextual biasing should help. While not difficult to implement, most generally available ASR solutions don't make it easy to use. There's a PR in ctranslate2 implementing the same feature so that it could be exposed in faster-whisper: https://github.com/OpenNMT/CTranslate2/pull/1789
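    The shallow contextual biasing mentioned in that comment can be sketched in a few lines: add a fixed bonus to the scores of tokens from a user-supplied bias list before picking the best one. Real implementations (such as the CTranslate2 PR linked above) bias prefixes of multi-token phrases during beam search; this illustrative, self-contained sketch collapses it to a single decoding step with made-up vocabulary and scores:

    ```cpp
    #include <algorithm>
    #include <cstdio>
    #include <map>
    #include <string>
    #include <vector>

    int main() {
        // Candidate tokens with raw decoder scores (illustrative values).
        const std::vector<std::string> vocab = {"protos", "probes", "Protoss"};
        std::vector<double> scores = {1.2, 1.1, 0.9};

        // Bias list: domain terms the user wants recognized, with a score bonus.
        const std::map<std::string, double> bias = {{"Protoss", 0.5}};

        // Shallow biasing: boost the score of any token on the bias list.
        for (size_t i = 0; i < vocab.size(); ++i) {
            auto it = bias.find(vocab[i]);
            if (it != bias.end()) scores[i] += it->second;
        }

        // Greedy pick: the biased domain term now wins over the generic tokens.
        size_t best = std::max_element(scores.begin(), scores.end()) - scores.begin();
        std::printf("picked: %s\n", vocab[best].c_str());
        return 0;
    }
    ```

    The appeal of the technique is exactly this simplicity: it needs no retraining, only a score adjustment inside the decoder loop.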

  9. lightseq

    LightSeq: A High Performance Library for Sequence Processing and Generation

  10. bark.cpp

    Suno AI's Bark model in C/C++ for fast text-to-speech generation

  11. cppflow

    Run TensorFlow models in C++ without installation and without Bazel

  12. tensorrt-cpp-api

    TensorRT C++ API Tutorial

  13. dlstreamer

    This repository is home to the Intel® Deep Learning Streamer (Intel® DL Streamer) Pipeline Framework, a streaming media analytics framework for creating complex media analytics pipelines, based on the GStreamer* multimedia framework.

  14. onnxruntime_backend

    The Triton backend for the ONNX Runtime.

    Project mention: Zero-Shot Text Classification on a low-end CPU-only machine? | news.ycombinator.com | 2024-10-07

    Hah, it actually gets worse. What I was describing was the Triton ONNX backend with the OpenVINO execution accelerator[0] (not the OpenVINO backend itself). Clear as mud, right?

    Your issue here is model performance with the additional challenge of offering it over a network socket across multiple requests and doing so in a performant manner.

    Triton does things like dynamic batching[1] where throughput is increased significantly by aggregating disparate requests into one pass through the GPU.

    A docker container for torch, ONNX, OpenVINO, etc isn't even natively going to offer a network socket. This is where people try to do things like rolling their own FastAPI API implementation (or something) only to discover it completely falls apart at any kind of load. That's development effort as well but it's a waste of time.

    [0] - https://github.com/triton-inference-server/onnxruntime_backe...

    [1] - https://docs.nvidia.com/deeplearning/triton-inference-server...
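    The dynamic batching argument in that comment comes down to amortizing fixed per-pass overhead: if one forward pass costs a fixed overhead plus a small per-item cost, aggregating requests that arrive within a short window into one pass multiplies throughput. A self-contained sketch of the arithmetic, with made-up costs:

    ```cpp
    #include <cstdio>

    int main() {
        // Illustrative costs: each forward pass pays a fixed overhead
        // (kernel launches, scheduling) plus a small per-item cost.
        const double pass_overhead_ms = 10.0;
        const double per_item_ms = 1.0;
        const int requests = 8;

        // Unbatched: each request gets its own pass and pays the full overhead.
        double unbatched = requests * (pass_overhead_ms + per_item_ms);
        // Dynamically batched: one shared pass amortizes the overhead.
        double batched = pass_overhead_ms + requests * per_item_ms;

        std::printf("unbatched: %.0f ms, batched: %.0f ms\n", unbatched, batched);
        return 0;
    }
    ```

    This is why a hand-rolled one-request-per-call HTTP wrapper falls apart under load while a batching server with the same model does not.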

  15. EasyOCR-cpp

    Custom C++ implementation of deep learning based OCR

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).


C++ Inference related posts

  • Whisper.cpp: Looking for Maintainers

    1 project | news.ycombinator.com | 4 Feb 2025
  • Court is in session: Top 10 most notorious C and C++ errors in 2024

    2 projects | dev.to | 28 Dec 2024
  • OpenMP 6.0

    2 projects | news.ycombinator.com | 14 Nov 2024
  • OpenVINO's AI Success: Brilliance or Cracks Beneath the Surface?

    1 project | dev.to | 18 Sep 2024
  • 12 moments of typos and copy-paste, or why AI hallucinates: checking OpenVINO

    1 project | dev.to | 26 Jun 2024
  • Intel releases OpenVINO 2024.2 with broader LLM and quantization support

    1 project | news.ycombinator.com | 18 Jun 2024
  • Show HN: I ported Suno AI's Bark model in C for fast realistic audio generation

    1 project | news.ycombinator.com | 24 Apr 2024

Index

What are some of the best open-source Inference projects in C++? This list will help you:

# Project Stars
1 whisper.cpp 39,345
2 mediapipe 29,432
3 ncnn 21,332
4 TensorRT 11,478
5 jetson-inference 8,241
6 openvino 8,129
7 TNN 4,499
8 CTranslate2 3,751
9 lightseq 3,244
10 bark.cpp 802
11 cppflow 799
12 tensorrt-cpp-api 693
13 dlstreamer 548
14 onnxruntime_backend 141
15 EasyOCR-cpp 55

