Top 23 C++ neural-network Projects
-
PaddlePaddle
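PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle 『飞桨』 core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)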
-
ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
-
DALI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
-
mace
MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.
-
Simd
C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX (Altivec) and VSX (Power7) for PowerPC, NEON for ARM. (by ermig1979)
-
armnn
Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn
-
MocapNET
We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocular color images. Our contributions include: (a) A novel and compact 2D pose NSRM representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose by also allowing for the decomposition of the body to an upper and lower kinematic
-
distributed-llama
Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
-
SegmentationCpp
A trainable C++ semantic segmentation library based on LibTorch (PyTorch C++). Backbones: VGG, ResNet, ResNext. Architectures so far: FPN, U-Net, PAN, LinkNet, PSPNet, DeepLab-V3, DeepLab-V3+.
-
vs-mlrt
Efficient CPU/GPU/Vulkan ML Runtimes for VapourSynth (with built-in support for waifu2x, DPIR, RealESRGANv2/v3, Real-CUGAN, RIFE, SCUNet, SwinIR and more!)
Project mention: TensorFlow-metal on Apple Mac is junk for training | news.ycombinator.com | 2024-01-16
Project mention: AMD Funded a Drop-In CUDA Implementation Built on ROCm: It's Open-Source | news.ycombinator.com | 2024-02-12
ncnn uses Vulkan for GPU acceleration, I've seen it used in a few projects to get AMD hardware support.
https://github.com/Tencent/ncnn
Yet another TEDIOUS BATTLE: Python vs. C++/C stack.
This project gained popularity due to the HIGH DEMAND for running large models with 1B+ parameters, like `llama`. Python dominates the interface and training ecosystem, but prior to llama.cpp, non-ML professionals showed little interest in a fast C++ interface library. While existing solutions like tensorflow-serving [1] in C++ were sufficiently fast with GPU support, llama.cpp took the initiative to optimize for CPU and trim unnecessary code, essentially code-golfing and sacrificing some algorithm correctness for improved performance, which isn't favored by "ML research".
NOTE: In my opinion, a true pioneer was DarkNet, which implemented the YOLO model series and significantly outperformed others [2]. It used basically the same trick as llama.cpp.
[1] https://github.com/tensorflow/serving
Project mention: MatX: Efficient C++17 GPU numerical computing library with Python-like syntax | news.ycombinator.com | 2023-10-03
I think a comparison to PyTorch, TensorFlow and/or JAX is more relevant than a comparison to CuPy/NumPy.
And then maybe also a comparison to Flashlight (https://github.com/flashlight/flashlight) or other C/C++ based ML/computing libraries?
Also, there is no mention of it, so I suppose this does not support automatic differentiation?
Another option is DALI: https://github.com/NVIDIA/DALI. For my project, while training EfficientNet2, it was a game changer. But it is way harder to implement in code than TorchVision or Kornia.
I was curious about these libraries a few weeks ago and did some searching. Is there one that's got a clearly dominating set of users or contributors?
I don't know what a good way to compare these might be, other than perhaps activity/contributor count.
[1] https://github.com/simd-everywhere/simde
[2] https://github.com/ermig1979/Simd
[3] https://github.com/google/highway
[4] https://gitlab.com/libeigen/eigen
[5] https://github.com/shibatch/sleef
Project mention: LeCun: Qualcomm working with Meta to run Llama-2 on mobile devices | news.ycombinator.com | 2023-07-23
Like ARM? https://github.com/ARM-software/armnn
Optimization for this workload has arguably been in-progress for decades. Modern AVX instructions can be found in laptops that are a decade old now, and most big inferencing projects are built around SIMD or GPU shaders. Unless your computer ships with onboard Nvidia hardware, there's usually not much difference in inferencing performance.
Project mention: ExecuTorch: Enabling On-Device inference for embedded devices | news.ycombinator.com | 2023-10-17
Yes, ExecuTorch is currently targeted at edge devices. The runtime is written in C++ with a 50KB binary size (without kernels) and should run on most platforms. You are right that we have not integrated with an Nvidia backend yet. Have you tried torch.compile() in PyTorch 2.0? It would do the Nvidia optimization for you without TorchScript. If you have specific binary-size or edge-specific requests, feel free to file issues at https://github.com/pytorch/executorch/issues
Project mention: [D] Run Pytorch model inference on Microcontroller | /r/MachineLearning | 2023-11-14
CMSIS-NN. ARM-centric. Examples. They also have an example of a PyTorch-to-TFLite converter via ONNX.
Imagine you have an AI-powered personal alerting chat assistant that interacts using up-to-date data. Whether it's a big move in the stock market that affects your investments, any significant change on your shared SharePoint documents, or discounts on Amazon you were waiting for, the application is designed to keep you informed and alert you about any significant changes based on the criteria you set in advance using your natural language. In this post, we will learn how to build a full-stack event-driven weather alert chat application in Python using pretty cool tools: Streamlit, NATS, and OpenAI. The app can collect real-time weather information, understand your criteria for alerts using AI, and deliver these alerts to the user interface.
or whatever you want, you need to write the code yourself though. https://github.com/AmusementClub/vs-mlrt
C++ neural-network related posts
- Distributed Grok-1 (314B)
- Show HN: Distributed Llama – Run LLMs on multiple devices in parallel
- Distributed Llama
- WyGPT: Minimal mature GPT model in C++
- Hi, What could be the best HLS tool for implementing neural networks on FPGA
- Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields
- Apple previews Live Speech, Personal Voice, and more new accessibility features
Index
What are some of the best open-source neural-network projects in C++? This list will help you:
# | Project | Stars |
---|---|---|
1 | tensorflow | 182,456 |
2 | PaddlePaddle | 21,616 |
3 | ncnn | 19,176 |
4 | CNTK | 17,435 |
5 | serving | 6,071 |
6 | tiny-cnn | 5,763 |
7 | oneflow | 5,721 |
8 | flashlight | 5,145 |
9 | DALI | 4,914 |
10 | mace | 4,876 |
11 | tiny-cuda-nn | 3,397 |
12 | Simd | 1,971 |
13 | fann | 1,549 |
14 | armnn | 1,117 |
15 | hls4ml | 1,103 |
16 | MocapNET | 803 |
17 | executorch | 710 |
18 | distributed-llama | 708 |
19 | ML-examples | 404 |
20 | SegmentationCpp | 402 |
21 | ONE | 398 |
22 | liboai | 291 |
23 | vs-mlrt | 230 |