Top 6 efficient-inference Open-Source Projects
- Efficient-AI-Backbones: Efficient AI backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Project mention: A look at Apple’s new Transformer-powered predictive text model | news.ycombinator.com | 2023-09-16
I'm pretty fatigued on constantly providing references and sources in this thread, but here is an example of what they've made publicly available:
https://github.com/snap-research/EfficientFormer
Project mention: DeepCache: Accelerating Diffusion Models for Free | news.ycombinator.com | 2023-12-05
With the currently popular GPTQ, 3-bit quantization hurts performance much more than 4-bit, but there are also AWQ (https://github.com/mit-han-lab/llm-awq) and SqueezeLLM (https://github.com/SqueezeAILab/SqueezeLLM), which manage 3-bit without as large a performance drop. I hope to see them used more commonly.
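To give a rough intuition for why dropping from 4 bits to 3 bits costs more accuracy, here is a minimal sketch of plain round-to-nearest uniform quantization applied to synthetic Gaussian "weights". This is not GPTQ, AWQ, or SqueezeLLM; the function names and the synthetic data are assumptions made purely for illustration.

```python
import random


def quantize_uniform(weights, bits):
    """Round-to-nearest uniform quantization over the min/max range.

    Illustrative only: real methods such as GPTQ, AWQ, or SqueezeLLM
    use more sophisticated schemes than this.
    """
    levels = 2 ** bits - 1          # number of steps between min and max
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels      # width of one quantization step
    return [lo + round((w - lo) / scale) * scale for w in weights]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Synthetic stand-in for a weight tensor (assumption, not real model weights).
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

mse_by_bits = {
    bits: mean_squared_error(weights, quantize_uniform(weights, bits))
    for bits in (3, 4)
}
for bits, mse in mse_by_bits.items():
    print(f"{bits}-bit MSE: {mse:.5f}")
```

Each bit removed roughly doubles the step size, so the reconstruction error grows quickly at low bit widths. The methods mentioned above aim to reduce that loss compared with naive rounding.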
efficient-inference related posts
- Llama33B vs Falcon40B vs MPT30B
- Has anyone tried out Squeezellm?
- SqueezeLLM: Dense-and-Sparse Quantization
- New quantization method SqueezeLLM allows for lossless compression for 3-bit and outperforms GPTQ and AWQ in both 3-bit and 4-bit. Quantized Vicuna and LLaMA models have been released.
- Researchers From China Introduce Vision GNN (ViG): A Graph Neural Network For Computer Vision Systems
- GNN for computer vision, beating CNN & Transformer
Index
What are some of the best open-source efficient-inference projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | Efficient-AI-Backbones | 3,804 |
2 | LLMCompiler | 1,069 |
3 | EfficientFormer | 944 |
4 | DeepCache | 603 |
5 | SqueezeLLM | 569 |
6 | KVQuant | 190 |