- TensorRT: NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
- Nebullvm: Easy-to-use library to boost AI inference leveraging state-of-the-art optimization techniques.
- aimet: AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
- Model-Compression-Research-Package: A library for researching neural network compression and acceleration methods.
Related posts
- AMD MI300X: 30% higher performance than Nvidia H100, even with an optimized stack
- Getting SDXL-Turbo running with TensorRT
- Show HN: Ollama for Linux – Run LLMs on Linux with GPU Acceleration
- Train Your AI Model Once and Deploy on Any Cloud
- A1111 just added support for TensorRT for webui as an extension!