Show HN: LlamaGPT – Self-hosted, offline, private AI chatbot, powered by Llama 2

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • llama-gpt

    A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!

  • I put up a draft PR to demo how to run it on a GPU: https://github.com/getumbrel/llama-gpt/pull/11

    It breaks other things like model downloading, but once I got it to a working state for myself, I figured why not put it up there in case its useful. If I have time, I'll try to rework it a little bit with more parameters and less dockerfile repetition to fit the main project better.

  • ollama

    Get up and running with Llama 3, Mistral, Gemma, and other large language models.

  • https://github.com/jmorganca/ollama was extremely simple to get running on my M1 and has a couple uncensored models you can just download and use.

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • serge

    A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.

  • Very cool, this looks like a combination of chatbot-ui and llama-cpp-python? A similar project I've been using is https://github.com/serge-chat/serge. Nous-Hermes-Llama2-13b is my daily driver and scores high on coding evaluations (https://huggingface.co/spaces/mike-ravkine/can-ai-code-resul...).

  • can-ai-code

    Self-evaluating interview for AI coders

  • Very cool, this looks like a combination of chatbot-ui and llama-cpp-python? A similar project I've been using is https://github.com/serge-chat/serge. Nous-Hermes-Llama2-13b is my daily driver and scores high on coding evaluations (https://huggingface.co/spaces/mike-ravkine/can-ai-code-resul...).

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • I like this for turn by turn conversations: https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b

    this for zero shot instructions: https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-...

    easiest way would be https://github.com/oobabooga/text-generation-webui

    a little more complex way I do is I have a stack with llama.cpp server, a openai adapter, and bettergpt as frontend using the openai adapter as the custom endpoint. bettergpt ux beats oogaboga by a long way (and chatgpt on certain aspects)

  • gpt4all

    gpt4all: run open-source LLMs anywhere

  • Agreed.

    Gpt4all[1] offers a similar 'simple setup' but with application exe downloads, but is arguably more like open core because the gpt4all makers (nomic?) want to sell you the vector database addon stuff on top.

    [1]https://github.com/nomic-ai/gpt4all

    I like this one because it feels more private / is not being pushed by a company that can do a rug pull. This can still do a rug pull, but it would be harder to do.

  • llm

    Access large language models from the command-line (by simonw)

  • What is the advantage of this versus running something like https://github.com/simonw/llm , which also gives you options to e.g. use https://github.com/simonw/llm-mlc for accelerated inference?

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • llm-mlc

    LLM plugin for running models using MLC

  • What is the advantage of this versus running something like https://github.com/simonw/llm , which also gives you options to e.g. use https://github.com/simonw/llm-mlc for accelerated inference?

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts