Ask HN: What is the current (Apr. 2024) gold standard of running an LLM locally?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • ollama

    Get up and running with Llama 3, Mistral, Gemma, and other large language models.

  • Ollama is really easy.

    brew install ollama

    brew services start ollama

    ollama pull mistral

    Ollama exposes an HTTP API that gives you one consistent way to prompt, regardless of model.

    https://github.com/ollama/ollama/blob/main/docs/api.md#reque...
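
    As a rough sketch of what "consistent interface" means in practice, the snippet below posts a prompt to the generate endpoint using only the Python standard library. It assumes the server is running on Ollama's default local port (11434) and that `mistral` has already been pulled; `build_request` and `generate` are illustrative helper names, not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request; only the model name changes per model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example (needs a running server and a pulled model):
#   print(generate("mistral", "Why is the sky blue?"))
```

    Swapping `"mistral"` for another pulled model name should be the only change needed to prompt a different model.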

  • transformerlab-app

    Experiment with Large Language Models

  • Some of the tools offer a path to tool use (fetching URLs and doing things with them) or RAG (searching your documents). I think Oobabooga (https://github.com/oobabooga/text-generation-webui) offers the latter through plugins.

    Our tool, https://github.com/transformerlab/transformerlab-app, also supports the latter (document search) using local LLMs.
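
    To make the RAG idea concrete, here is a toy sketch of the retrieval step: score document chunks against the question, then prepend the best match to the prompt. It is not how Oobabooga or Transformer Lab implement document search; real tools typically use an embedding model, and the word-count similarity and all function names here are stand-ins chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase word counts; a crude stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = tokenize(query)
    return sorted(chunks, key=lambda c: cosine(q, tokenize(c)), reverse=True)[:k]


def build_prompt(query: str, chunks: list[str], k: int = 1) -> str:
    """Prepend the retrieved chunks to the question before sending it to a local model."""
    context = "\n".join(retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

    The string returned by `build_prompt` is what you would hand to whichever local model you are running.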

  • llama-cpp-rs

  • mlx

    MLX: An array framework for Apple silicon

  • If you're able to purchase a separate GPU, the most popular options are the NVIDIA RTX 3090 and RTX 4090.

    Apple M2 and M3 Macs are becoming a viable option because of MLX (https://github.com/ml-explore/mlx). If you are getting an M-series Mac for LLMs, I'd recommend getting something with 24GB or more of RAM.
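
    A back-of-the-envelope way to sanity-check a RAM or VRAM budget is to estimate the size of the weights alone: parameter count times bits per weight, divided by 8. The sketch below does only that arithmetic. It ignores the KV cache, activations, and the operating system, and the 0.75 "usable fraction" is an assumption for illustration, not a measured figure.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes.

    One billion parameters at 8 bits each is roughly 1 GB, so the
    result is simply billions of parameters times bytes per weight.
    """
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         ram_gb: float, usable_fraction: float = 0.75) -> bool:
    """True if the weights fit in an assumed usable fraction of memory."""
    return weight_memory_gb(params_billions, bits_per_weight) <= ram_gb * usable_fraction


# 7B parameters at 4 bits per weight: 7 * 4 / 8 = 3.5 GB of weights.
# 70B parameters at 4 bits per weight: 70 * 4 / 8 = 35 GB of weights.
```

    By this estimate a 7B model at 16 bits (14 GB of weights) fits within 24GB under the assumed fraction, while a 70B model at 4 bits (35 GB) does not.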

  • chat-with-doc

    A Web application that enables users to upload documents and engage in conversational analysis with self-hosted large language models (LLMs) deployed on Ollama.

  • jan

    Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • llm_steer-oobabooga

    Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors, now in oobabooga text generation webui!

  • I agree with you re: extensions. I know something of the extension ecosystem for it - I built one - https://github.com/Hellisotherpeople/llm_steer-oobabooga

    Oobabooga is the closest thing we have to maximalism. It exposes by far the largest number of parameters/settings/backends of any of these tools.

    My main point is that the world yearns for a proper "Photoshop for text" - and no one has even tried to make this (closest is oobabooga). All VC backed competitors are not even close to the mark on what they should be doing here.

  • llm-chatbot-rag

    A local LLM chatbot with RAG for PDF input files

  • Paywalled article: https://towardsdatascience.com/how-to-build-a-local-open-sou...

    Source code: https://github.com/leoneversberg/llm-chatbot-rag
