Top 23 TensorRT Open-Source Projects

TensorRT

Mentions: 22 | Stars: 9,110 | Activity: 5.0 | Language: C++

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Project mention: AMD MI300X 30% higher performance than Nvidia H100, even with optimized stack | news.ycombinator.com | 2023-12-17

> It's not rocket science to implement matrix multiplication in any GPU.
You're right, it's harder. Saying this as someone who's done more work on the former than the latter. (I have, with a team, built a rocket engine. And not your school or backyard project size, but nozzle bigger than your face kind. I've also written CUDA kernels and boy is there a big learning curve to the latter that you gotta fundamentally rethink how you view a problem. It's unquestionable why CUDA devs are paid so much. Really it's only questionable why they aren't paid more)
I know it is easy to think this problem is easy, it really looks that way. But there's an incredible amount of optimization that goes into all of this and that's what's really hard. You aren't going to get away with just N for loops for a tensor rank N. You got to chop the data up, be intelligent about it, manage memory, how you load memory, handle many data types, take into consideration different results for different FMA operations, and a whole lot more. There's a whole lot of non-obvious things that result in high optimization (maybe obvious __after__ the fact, but that's not truthfully "obvious"). The thing is, the space is so well researched and implemented that you can't get away with naive implementations, you have to be on the bleeding edge.
Then you have to do that and make it reasonably usable for the programmer too, abstracting away all of that. Cuda also has a huge head start and momentum is not a force to be reckoned with (pun intended).
Look at TensorRT[0]. The software isn't even complete and it still isn't going to cover all neural networks on all GPUs. I've had stuff work on a V100 and H100 but not an A100, then later get fixed. They even have the "Apple Advantage" in that they have control of the hardware. I'm not certain AMD will have the same advantage. We talk a lot about the difficulties of being first mover, but I think we can also recognize that momentum is an advantage of being first mover. And it isn't one to scoff at.
[0] https://github.com/NVIDIA/TensorRT
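
To make the comment's point about "chopping the data up" concrete, here is a minimal sketch in plain Python that contrasts a naive triple loop with a blocked (tiled) multiply. It comes from neither the comment nor TensorRT: the function names and the `tile` parameter are hypothetical, chosen only for illustration. Real GPU kernels apply the same blocking idea using shared memory and registers, along with many further optimizations not shown here.

```python
def matmul_naive(a, b):
    """Plain triple loop: one pass over i, j, p with no blocking."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Blocked multiply: process tile x tile sub-blocks so that each
    block of a and b is reused while it is still "hot" (in cache, or in
    shared memory on a GPU). `tile` is an illustrative parameter."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # block row of c
        for j0 in range(0, m, tile):      # block column of c
            for p0 in range(0, k, tile):  # block along the shared dimension
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
result = matmul_tiled(a, b, tile=2)
```

Both functions compute the same product; only the loop order and data reuse differ. In pure Python the tiled version is not faster, so treat it as a picture of the access pattern rather than a benchmark.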

YOLOX

Mentions: 12 | Stars: 9,012 | Activity: 1.5 | Language: Python

YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
jetson-inference

Mentions: 11 | Stars: 7,349 | Activity: 7.7 | Language: C++

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
tensorrtx

Mentions: 3 | Stars: 6,584 | Activity: 8.4 | Language: C++

Implementation of popular deep learning networks with TensorRT network definition API
yolo_tracking

Mentions: 8 | Stars: 6,110 | Activity: 9.9 | Language: Python

BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
torch2trt

Mentions: 5 | Stars: 4,395 | Activity: 3.1 | Language: Python

An easy to use PyTorch to TensorRT converter
TNN

Mentions: 1 | Stars: 4,281 | Activity: 2.5 | Language: C++

TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop and server. TNN is distinguished by several outstanding features, including its cross-platform capability, high performance, model compression and code pruning. Based on ncnn and Rapidnet, TNN further strengthens the support and performance optimization for mobile devices, and also draws on the good extensibility and high performance of existing open source efforts.
yolov7_d2

Mentions: 4 | Stars: 3,130 | Activity: 0.0 | Language: Python

🔥🔥🔥🔥 (Earlier YOLOv7, not the official one) YOLO with Transformers and Instance Segmentation, with TensorRT acceleration! 🔥🔥🔥
FastDeploy

Mentions: 5 | Stars: 2,705 | Activity: 7.5 | Language: C++

⚡️An Easy-to-use and Fast Deep Learning Model Deployment Toolkit for ☁️Cloud 📱Mobile and 📹Edge. Covers 20+ mainstream Image, Video, Text and Audio scenarios and 150+ SOTA models, with end-to-end optimization and multi-platform, multi-framework support.

Project mention: Testing YOLO on Orange Pi 5 | /r/OrangePI | 2023-07-09

mmdeploy

Mentions: 4 | Stars: 2,511 | Activity: 7.9 | Language: Python

OpenMMLab Model Deployment Framework

Project mention: [D] Object detection models that can be easily converted to CoreML | /r/MachineLearning | 2023-07-25

deepdetect

Mentions: 4 | Stars: 2,495 | Activity: 6.7 | Language: C++

Deep Learning API and Server in C++14 with support for Caffe, PyTorch, TensorRT, Dlib, NCNN, Tensorflow, XGBoost and TSNE

Project mention: Exploring Open-Source Alternatives to Landing AI for Robust MLOps | dev.to | 2023-12-13

For those seeking a lightweight solution for setting up deep learning REST APIs across platforms without the complexity of Kubernetes, Deepdetect is worth considering.

tensorRT_Pro

Mentions: 1 | Stars: 2,381 | Activity: 3.1 | Language: C++

C++ library based on tensorrt integration
TensorRT

Mentions: 5 | Stars: 2,340 | Activity: 9.5 | Language: Python

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT (by pytorch)
tensorflow-yolov4-tflite

Mentions: 4 | Stars: 2,223 | Activity: 0.0 | Language: Python

YOLOv4, YOLOv4-tiny, YOLOv3, YOLOv3-tiny implemented in Tensorflow 2.0, Android. Convert YOLO v4 .weights to tensorflow, tensorrt and tflite
yolov5-face

Mentions: 2 | Stars: 1,941 | Activity: 3.7 | Language: Python

YOLO5Face: Why Reinventing a Face Detector (https://arxiv.org/abs/2105.12931, ECCV Workshops 2022)
tensorrt_demos

Mentions: 5 | Stars: 1,720 | Activity: 3.1 | Language: Python

TensorRT MODNet, YOLOv4, YOLOv3, SSD, MTCNN, and GoogLeNet
GenerativeAIExamples

Mentions: 1 | Stars: 1,535 | Activity: 7.5 | Language: Python

Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

Project mention: FLaNK Weekly 18 Dec 2023 | dev.to | 2023-12-18

FastMOT

Mentions: 2 | Stars: 1,095 | Activity: 0.0 | Language: Python

High-performance multiple object tracking based on YOLO, Deep SORT, and KLT 🚀
inference

Mentions: 5 | Stars: 1,022 | Activity: 9.9 | Language: Python

A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models. (by roboflow)

Project mention: Supervision: Reusable Computer Vision | news.ycombinator.com | 2024-03-24

Yeah, inference[1] is our open source package for running locally (either directly in Python or via a Docker container). It works with all the models on Universe, models you train yourself (assuming we support the architecture; we have a bunch of notebooks available[2]), or train in our platform, plus several more general foundation models[3] (for things like embeddings, zero-shot detection, question answering, OCR, etc).
We also have a hosted API[4] you can hit for most models we support (except some of the large vision models that are really GPU-heavy) if you prefer.
[1] https://github.com/roboflow/inference
[2] https://github.com/roboflow/notebooks
[3] https://inference.roboflow.com/foundation/about/
[4] https://docs.roboflow.com/deploy/hosted-api

Radiata

Mentions: 8 | Stars: 983 | Activity: 8.1 | Language: Python

Stable diffusion webui based on diffusers.

Project mention: 🌠🌟Radiata TensorRT WebUI ⚡🏎️💨 | /r/DeepFloydIF | 2023-06-02

Stable-Diffusion-NCNN

Mentions: 8 | Stars: 935 | Activity: 4.9 | Language: C++

Stable Diffusion in NCNN with C++, supporting txt2img and img2img

Project mention: Stable Diffusion implemented by ncnn framework based on C++, supported txt2img and img2img! | /r/StableDiffusion | 2023-06-08

trt_pose

Mentions: 3 | Stars: 921 | Activity: 0.0 | Language: Python

Real-time pose estimation accelerated with NVIDIA TensorRT
TensorRT-For-YOLO-Series

Mentions: 1 | Stars: 793 | Activity: 3.5 | Language: C++

tensorrt for yolo series (YOLOv8, YOLOv7, YOLOv6, YOLOv5), nms plugin support

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

TensorRT related posts

AMD MI300X 30% higher performance than Nvidia H100, even with optimized stack

1 project | news.ycombinator.com | 17 Dec 2023
Getting SDXL-turbo running with tensorRT

1 project | /r/StableDiffusion | 6 Dec 2023
Show HN: Ollama for Linux – Run LLMs on Linux with GPU Acceleration

14 projects | news.ycombinator.com | 26 Sep 2023
Nvidia Introduces TensorRT-LLM for Accelerating LLM Inference on H100/A100 GPUs

3 projects | news.ycombinator.com | 8 Sep 2023
[D] Object detection models that can be easily converted to CoreML

1 project | /r/MachineLearning | 25 Jul 2023
Train Your AI Model Once and Deploy on Any Cloud

3 projects | news.ycombinator.com | 8 Jul 2023
🌠🌟Radiata TensorRT WebUI ⚡🏎️💨

1 project | /r/DeepFloydIF | 2 Jun 2023

Index

What are some of the best open-source TensorRT projects? This list will help you:

#	Project	Stars
1	TensorRT	9,110
2	YOLOX	9,012
3	jetson-inference	7,349
4	tensorrtx	6,584
5	yolo_tracking	6,110
6	torch2trt	4,395
7	TNN	4,281
8	yolov7_d2	3,130
9	FastDeploy	2,705
10	mmdeploy	2,511
11	deepdetect	2,495
12	tensorRT_Pro	2,381
13	TensorRT	2,340
14	tensorflow-yolov4-tflite	2,223
15	yolov5-face	1,941
16	tensorrt_demos	1,720
17	GenerativeAIExamples	1,535
18	FastMOT	1,095
19	inference	1,022
20	Radiata	983
21	Stable-Diffusion-NCNN	935
22	trt_pose	921
23	TensorRT-For-YOLO-Series	793