efficient-inference

Open-source projects categorized as efficient-inference

Top 6 efficient-inference Open-Source Projects

  • Efficient-AI-Backbones

    Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

  • LLMCompiler

    LLMCompiler: An LLM Compiler for Parallel Function Calling

  • Project mention: FLaNK Weekly 18 Dec 2023 | dev.to | 2023-12-18
  • EfficientFormer

    EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]

  • Project mention: A look at Apple’s new Transformer-powered predictive text model | news.ycombinator.com | 2023-09-16

    I'm pretty fatigued from constantly providing references and sources in this thread, but here's an example of what they've made publicly available:

    https://github.com/snap-research/EfficientFormer

  • DeepCache

    [CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

  • Project mention: DeepCache: Accelerating Diffusion Models for Free | news.ycombinator.com | 2023-12-05
  • SqueezeLLM

    [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

  • Project mention: Llama33B vs Falcon40B vs MPT30B | /r/LocalLLaMA | 2023-07-05

    Using the currently popular GPTQ, 3-bit quantization hurts performance much more than 4-bit does, but there are also AWQ (https://github.com/mit-han-lab/llm-awq) and SqueezeLLM (https://github.com/SqueezeAILab/SqueezeLLM), which can manage 3-bit without as much of a performance drop. I hope to see them used more commonly.
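The bit-width tradeoff the comment describes can be illustrated with a toy round-to-nearest sketch: fewer bits means fewer quantization levels, hence coarser steps and larger reconstruction error. This is plain uniform quantization, not the actual GPTQ, AWQ, or SqueezeLLM algorithms, which use far more sophisticated schemes to close exactly this gap at 3-bit.

```python
import numpy as np

def quantize_dequantize(weights, bits):
    """Toy uniform symmetric round-to-nearest quantization.
    Not GPTQ/AWQ/SqueezeLLM; just an illustration of bit-width vs. error."""
    levels = 2 ** (bits - 1) - 1          # 7 levels per sign at 4-bit, 3 at 3-bit
    scale = np.max(np.abs(weights)) / levels
    q = np.clip(np.round(weights / scale), -levels, levels)
    return q * scale                       # dequantize back to float

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)  # stand-in for a weight matrix

err4 = float(np.mean((w - quantize_dequantize(w, 4)) ** 2))
err3 = float(np.mean((w - quantize_dequantize(w, 3)) ** 2))
print(f"4-bit MSE: {err4:.6f}")
print(f"3-bit MSE: {err3:.6f}")  # noticeably larger: coarser steps at 3 bits
```

With naive rounding, halving the number of levels roughly quadruples the mean squared error, which is why 3-bit needs smarter methods to stay usable.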

  • KVQuant

    KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

  • Project mention: 10M Tokens LLM Context | news.ycombinator.com | 2024-02-02
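One recurring idea in KV-cache quantization work is choosing the quantization granularity to match where the outliers are, e.g. giving each key channel its own scale instead of one tensor-wide scale. The sketch below is a toy comparison under that assumption (uniform round-to-nearest with simulated outlier channels), not the actual KVQuant method, which is considerably more elaborate.

```python
import numpy as np

def per_channel_quant(x, bits=4):
    """Toy per-channel round-to-nearest quantization of a (tokens x channels)
    KV-cache tensor. One scale per channel; not the real KVQuant algorithm."""
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x), axis=0, keepdims=True) / levels
    return np.clip(np.round(x / scale), -levels, levels) * scale

rng = np.random.default_rng(1)
# Simulate keys where a few channels carry large outlier magnitudes,
# a pattern the KV-cache quantization literature often highlights.
keys = rng.normal(size=(128, 64)).astype(np.float32)
keys[:, :4] *= 20.0

# Baseline: one scale for the whole tensor (4-bit).
levels = 2 ** 3 - 1
s = np.abs(keys).max() / levels
per_tensor = np.clip(np.round(keys / s), -levels, levels) * s

err_tensor = float(np.mean((keys - per_tensor) ** 2))
err_channel = float(np.mean((keys - per_channel_quant(keys, 4)) ** 2))
print(f"per-tensor MSE:  {err_tensor:.4f}")   # outliers inflate the shared scale
print(f"per-channel MSE: {err_channel:.4f}")  # normal channels keep a fine scale
```

With a shared scale, the outlier channels force a coarse step size that flattens every other channel toward zero; per-channel scales avoid that, which is part of why finer granularity matters for long-context KV caches.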

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates how often a repo was mentioned in the last 12 months, or since we started tracking (Dec 2020).

efficient-inference related posts

  • Llama33B vs Falcon40B vs MPT30B

    2 projects | /r/LocalLLaMA | 5 Jul 2023
  • Has anyone tried out Squeezellm?

    1 project | /r/LocalLLaMA | 2 Jul 2023
  • SqueezeLLM: Dense-and-Sparse Quantization

    1 project | news.ycombinator.com | 15 Jun 2023
  • New quantization method SqueezeLLM allows for lossless compression at 3-bit and outperforms GPTQ and AWQ in both 3-bit and 4-bit. Quantized Vicuna and LLaMA models have been released.

    2 projects | /r/LocalLLaMA | 14 Jun 2023
  • Researchers From China Introduce Vision GNN (ViG): A Graph Neural Network For Computer Vision Systems

    1 project | /r/machinelearningnews | 8 Jun 2022
  • GNN for computer vision, beating CNN & Transformer

    1 project | /r/deeplearning | 4 Jun 2022

Index

What are some of the best open-source efficient-inference projects? This list will help you:

Rank  Project                 Stars
1     Efficient-AI-Backbones  3,804
2     LLMCompiler             1,069
3     EfficientFormer           944
4     DeepCache                 603
5     SqueezeLLM                569
6     KVQuant                   190
