llm: a Rust crate/CLI for CPU inference of LLMs, including LLaMA, GPT-NeoX, GPT-J and more

This page summarizes the projects mentioned and recommended in the original post on /r/rust

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • llm

    An ecosystem of Rust libraries for working with large language models

  • We're looking for feedback on the project, and we'd love to hear from you! If you're interested in contributing, please reach out to us on our Discord, or post an issue on our GitHub.

  • ggml

    Tensor library for machine learning

  • The current direction most people are taking is to "fine-tune" an existing base model (something like StableLM or Dolly) using a technique called LoRA. Ref: https://github.com/tloen/alpaca-lora People are also looking into fine-tuning directly on the GGML format. Ref: https://github.com/ggerganov/ggml/issues/8

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • llama.cpp

    LLM inference in C/C++

  • At present, we are powered by ggml (similar to llama.cpp), but we intend to add additional backends in the near-future. This means that we currently only support CPU inference, but we have several ideas in mind for how to add GPU support, as well as other accelerators.

  • alpaca-lora

    Instruct-tune LLaMA on consumer hardware

  • The current direction most people are taking is to "fine-tune" an existing base model (something like StableLM or Dolly) using a technique called LoRA. Ref: https://github.com/tloen/alpaca-lora People are also looking into fine-tuning directly on the GGML format. Ref: https://github.com/ggerganov/ggml/issues/8

  • tch-rs

    Rust bindings for the C++ api of PyTorch.

  • You could try looking at the min-GPT example of tch-rs. I'd also strongly suggest watching Karpathy's video to understand what's going on.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts