LocalAI: OpenAI compatible API to run LLM models locally on consumer grade hardware!

This page summarizes the projects mentioned and recommended in the original post on /r/selfhosted

  • LocalAI

    :robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate Text, Audio, Video, and Images, and also has voice cloning capabilities.

  • check out the examples, you can plug in chatbot-ui! https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui
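  • Since LocalAI is a drop-in replacement for the OpenAI API, existing OpenAI clients only need their base URL changed to point at the local server. A minimal sketch of building a chat completion request body (the endpoint path follows the OpenAI schema; the port and model name below are illustrative assumptions, not values from the post):

    ```python
    import json

    # Hypothetical local endpoint; LocalAI mirrors OpenAI's /v1 routes.
    LOCALAI_URL = "http://localhost:8080/v1/chat/completions"

    def build_chat_request(model, prompt, temperature=0.7):
        """Return an OpenAI-style chat completion request body."""
        return {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": temperature,
        }

    # Serialize the body that would be POSTed to LOCALAI_URL.
    body = json.dumps(build_chat_request("ggml-gpt4all-j", "Hello!"))
    ```

    The same body works against the official OpenAI endpoint, which is what makes the "drop-in" claim practical.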

  • FastChat

    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

  • https://chat.lmsys.org/ now has a "chatbot arena" where you can pick two models and see their simultaneous responses to the same prompt. The demo service they're using is open source (https://github.com/lm-sys/FastChat) and some of the models they're using are also open source, but the majority of them are patches on top of the leaked Meta llama one and thus are of questionable licensing.

  • gpt4all

    gpt4all: run open-source LLMs anywhere

  • Some of them are available on Hugging Face, you can search for "ggml". I've listed a few of the most common ones in the README, like https://github.com/nomic-ai/gpt4all . I'm also working on simplifying getting freely licensed models in a more maintainable way, and exposing that through the API, stay tuned!

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • How does this compare to https://github.com/oobabooga/text-generation-webui ?

  • gpt-discord-bot

    Example Discord bot written in Python that uses the completions API to have conversations with the `text-davinci-003` model, and the moderations API to filter the messages.
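    The description implies a two-step flow: run each incoming message through the moderations API, then flatten the conversation into the plain-text prompt that completion models like `text-davinci-003` expect. A hedged sketch of that logic (function names and the response shape interpretation are illustrative, not taken from the bot's code):

    ```python
    def is_flagged(moderation_result):
        """Interpret a moderations-API-style response dict: a message is
        skipped if any result entry is flagged."""
        return any(r.get("flagged", False)
                   for r in moderation_result.get("results", []))

    def render_prompt(history):
        """Flatten (speaker, text) pairs into a completions-style prompt,
        leaving a trailing 'Bot:' line for the model to continue."""
        lines = [f"{speaker}: {text}" for speaker, text in history]
        lines.append("Bot:")
        return "\n".join(lines)
    ```

    Chat-style endpoints take structured messages instead, which is why completion-era bots like this need the explicit prompt-rendering step.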

  • ROCm

    Discontinued AMD ROCm™ Software - GitHub Home [Moved to: https://github.com/ROCm/ROCm]

  • Yeah, looks like there might be some current/future support for some of these LLMs with AMD: https://github.com/RadeonOpenCompute/ROCm/discussions/1836 It's just not as robust, from what I understand, unfortunately.

  • turbopilot

    Discontinued Turbopilot is an open source large-language-model based code completion engine that runs locally on CPU

  • alpaca.cpp

    Discontinued Locally run an Instruction-Tuned Chat-Style LLM

  • Try the instructions on this GitHub repo: https://github.com/antimatter15/alpaca.cpp. It's not the best one, but I was able to run this model on my Linux machine with 16GB memory; I think it's a good starting point.

  • serge

    A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy-to-use API.

  • Thanks for sharing your hard work. How would you say LocalAI differs from Serge?

  • web-llm

    Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.

  • Now that WebGPU is coming in Chrome 113, I am hoping to see more "in the browser" LLMs, like the amazing demo from MLC AI: https://mlc.ai/web-llm/

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
