Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • llama.cpp

    LLM inference in C/C++

  • The speedup would not be that high in practice for folks already using speculative sampling[1]. ANPD appears to be similar but uses a simpler, faster, and less accurate drafting approach. These two enhancements can't be meaningfully stacked.

    [1] https://github.com/ggerganov/llama.cpp/pull/2926

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

  • The HuggingFace transformers library already supports a similar method called prompt lookup decoding, which uses the existing context to build an n-gram model: https://github.com/huggingface/transformers/issues/27722

    I don't think it would be that hard to swap it out for a pretrained n-gram model.
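
    To make the idea in the comment above concrete, here is a minimal sketch of n-gram drafting from the existing context followed by greedy verification. It is not the transformers implementation or the ANPD authors' code. The function names (`propose_from_context`, `accept_draft`), the parameters, and the toy `next_token` callable standing in for the LLM are all hypothetical, and the sketch assumes greedy decoding.

```python
from typing import Callable, List, Sequence


def propose_from_context(tokens: Sequence[int], ngram_size: int = 3,
                         num_draft: int = 5) -> List[int]:
    """Draft tokens by matching the trailing n-gram against earlier context.

    Hypothetical helper: tries the longest n-gram first, then backs off.
    """
    for n in range(min(ngram_size, len(tokens) - 1), 0, -1):
        tail = list(tokens[-n:])
        # Scan right to left so the most recent earlier match wins;
        # the starting index skips the trailing n-gram itself.
        for start in range(len(tokens) - n - 1, -1, -1):
            if list(tokens[start:start + n]) == tail:
                # Propose whatever followed the match last time.
                return list(tokens[start + n:start + n + num_draft])
    return []  # no match: fall back to ordinary one-token decoding


def accept_draft(context: Sequence[int], draft: Sequence[int],
                 next_token: Callable[[Sequence[int]], int]) -> List[int]:
    """Keep draft tokens only while they equal the model's own greedy choice.

    Assumption: `next_token` stands in for the LLM. A real system would score
    all draft positions in one batched forward pass; this loop calls the model
    once per position purely to show the acceptance rule.
    """
    accepted: List[int] = []
    for tok in draft:
        expected = next_token(list(context) + accepted)
        if expected != tok:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)
    return accepted


# Toy "model" that cycles 1, 2, 3, 4, 5, 1, ... (purely illustrative).
toy_model = lambda ctx: (ctx[-1] % 5) + 1
history = [1, 2, 3, 4, 5, 1, 2, 3]
draft = propose_from_context(history, ngram_size=3, num_draft=4)
new_tokens = accept_draft(history, draft, toy_model)
```

    Because every accepted token equals what the model would have picked on its own, the output matches plain greedy decoding; the draft only changes how many tokens can be confirmed per step. Swapping in a pretrained n-gram model, as the comment suggests, would mean replacing only the drafting function.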


