Ask HN: What is the current (Apr. 2024) gold standard of running an LLM locally?

ollama

195 58,943 9.9 Go

Get up and running with Llama 3, Mistral, Gemma, and other large language models.

Ollama is really easy.
brew install ollama
brew services start ollama
ollama pull mistral
Ollama exposes an HTTP API that gives you a consistent way to prompt, regardless of model.
https://github.com/ollama/ollama/blob/main/docs/api.md#reque...
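
As a rough sketch of what prompting through that HTTP interface can look like: the snippet below assumes Ollama is serving on its default address, http://localhost:11434, and that the mistral model has already been pulled as shown above. The helper names (build_generate_request, generate) are made up for illustration; the /api/generate endpoint and its model, prompt and stream fields come from the Ollama API docs.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; change it if you run it elsewhere.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint (hypothetical helper)."""
    # stream=False asks for one JSON object instead of a stream of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt and return the generated text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


# Usage, with the server running:
# print(generate("mistral", "Why is the sky blue?"))
```

Swapping in another pulled model should only require changing the model argument, which is the "consistent interface" point made above.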

transformerlab-app

3 185 9.7 TypeScript

Experiment with Large Language Models

Some of the tools offer a path to doing tool use (fetching URLs and doing things with them) or RAG (searching your documents). I think Oobabooga https://github.com/oobabooga/text-generation-webui offers the latter through plugins.
Our tool, https://github.com/transformerlab/transformerlab-app, also supports the latter (document search) using local LLMs.
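
To make the RAG idea concrete, here is a toy sketch of the retrieval step: pick the stored document most similar to the question and paste it into the prompt. It is not how text-generation-webui or Transformer Lab implement document search. Real setups typically use an embedding model and a vector index, and the word-count similarity here is only a stand-in. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase word counts; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: list[str]) -> str:
    """Put the retrieved text ahead of the question before sending it to the LLM."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The string returned by build_prompt is what you would then send to whichever local model you are running.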

llama-cpp-rs

1 65 9.7 Rust
mlx

23 14,087 9.8 C++

MLX: An array framework for Apple silicon

If you're able to purchase a separate GPU, the most popular option is an NVIDIA RTX 3090 or RTX 4090.
Apple M2 and M3 Macs are becoming a viable option because of MLX https://github.com/ml-explore/mlx . If you are getting an M-series Mac for LLMs, I'd recommend something with 24GB or more of RAM.
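
One rough way to reason about how much RAM is enough: the weights alone take about (parameter count × bits per weight / 8) bytes, and the KV cache, activations and the OS need room on top of that. The sketch below is back-of-the-envelope arithmetic, not a measurement; the function name and the example model sizes are assumptions chosen for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Illustrative model sizes and quantization levels (assumptions, not recommendations).
for name, params, bits in [("7B @ 4-bit", 7, 4), ("7B @ 16-bit", 7, 16), ("70B @ 4-bit", 70, 4)]:
    print(f"{name}: ~{weight_memory_gb(params, bits):.1f} GB of weights")
```

By this estimate the weights of a 4-bit 7B model fit comfortably in 24GB, while those of a 4-bit 70B model do not.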

chat-with-doc

1 1 5.1 Python

A Web application that enables users to upload documents and engage in conversational analysis with self-hosted large language models (LLMs) deployed on Ollama.
jan

14 17,643 10.0 TypeScript

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)
text-generation-webui

876 36,293 9.9 Python

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.


llm_steer-oobabooga

4 26 6.6 Python

Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors, now in oobabooga text generation webui!

I agree with you re:extensions. I know something of the extension ecosystem for it - I built one - https://github.com/Hellisotherpeople/llm_steer-oobabooga
Oobabooga is the closest thing we have to maximalism. It exposes by far the largest number of parameters/settings/backends of any of these tools.
My main point is that the world yearns for a proper "Photoshop for text" - and no one has even tried to make this (closest is oobabooga). All VC backed competitors are not even close to the mark on what they should be doing here.

llm-chatbot-rag

1 27 5.9 Jupyter Notebook

A local LLM chatbot with RAG for PDF input files

Paywalled article: https://towardsdatascience.com/how-to-build-a-local-open-sou...
Source code: https://github.com/leoneversberg/llm-chatbot-rag

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
