rustformers/llm: Run inference for Large Language Models on CPU, with Rust πŸ¦€πŸš€πŸ¦™

This page summarizes the projects mentioned and recommended in the original post on /r/rust

  • llm

    An ecosystem of Rust libraries for working with large language models (a minimal usage sketch appears after this list)

  • wonnx has done some fantastic work in this regard, so that's where we plan to start once we get there. In terms of general discussion of alternate backends, see this issue.

  • wonnx

    A WebGPU-accelerated ONNX inference runtime written 100% in Rust, ready for native and the web (see the second sketch after this list)

  • llama-dfdx

    LLaMA 7B with CUDA acceleration implemented in Rust. Minimal GPU memory needed!

  • Not a maintainer, but dfdx can run llama with CUDA!
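
Below are the two sketches referenced above. First, loading a model and streaming tokens with the llm crate. This is a minimal sketch modeled on the crate's README example of the era; the exact names (llm::load, llm::models::Llama, InferenceRequest, load_progress_callback_stdout) changed between releases, the rand crate is assumed as a dependency, and the model path and prompt are placeholders.

```rust
use std::io::Write;

fn main() {
    // Load a GGML-format LLaMA model from disk; the path is a placeholder.
    let llama = llm::load::<llm::models::Llama>(
        std::path::Path::new("/path/to/model.bin"),
        Default::default(),
        llm::load_progress_callback_stdout,
    )
    .unwrap_or_else(|err| panic!("Failed to load model: {err}"));

    // Start an inference session and stream generated tokens to stdout.
    let mut session = llama.start_session(Default::default());
    let res = session.infer::<std::convert::Infallible>(
        &llama,
        &mut rand::thread_rng(),
        &llm::InferenceRequest {
            prompt: "Rust is a cool programming language because",
            ..Default::default()
        },
        // OutputRequest
        &mut Default::default(),
        |t| {
            // Print each token as it is produced.
            print!("{t}");
            std::io::stdout().flush().unwrap();
            Ok(())
        },
    );

    match res {
        Ok(result) => println!("\n\nInference stats:\n{result}"),
        Err(err) => println!("\n{err}"),
    }
}
```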
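Second, a hedged sketch of running an ONNX model on the GPU with wonnx's Session API. The model path, the "data" input name, the input size, and the use of the pollster crate to block on the async call are all assumptions for illustration; check the wonnx examples for the exact InputTensor/OutputTensor types.

```rust
use std::collections::HashMap;

// Run one inference pass over an ONNX model on the GPU via WebGPU.
async fn run_model() -> Result<(), Box<dyn std::error::Error>> {
    // Load an ONNX model; the path is a placeholder.
    let session = wonnx::Session::from_path("path/to/model.onnx").await?;

    // Bind input tensors by the names declared in the ONNX graph;
    // "data" and the all-zeros buffer stand in for a real input.
    let input = vec![0.0f32; 224 * 224 * 3];
    let mut inputs = HashMap::new();
    inputs.insert("data".to_string(), input.as_slice().into());

    // Execute the graph and print the named output tensors.
    let outputs = session.run(&inputs).await?;
    println!("{outputs:?}");
    Ok(())
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    pollster::block_on(run_model())
}
```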

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number generally means a more popular project.
