Top 23 large-language-model Open-Source Projects

gpt_academic

2 55,872 9.8 Python

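Provides a practical interactive interface for GPT/GLM and other large language models (LLMs), specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-translation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates 通义千问 (Tongyi Qianwen), deepseekcoder, 讯飞星火 (iFlytek Spark), 文心一言 (ERNIE Bot), llama2, rwkv, claude2, moss, and others.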

Project mention: Enhance Speed of AnkiBrain Addon | /r/ankibrain | 2023-12-06

I recently managed to manually install the AnkiBrain addon, utilizing my personal ChatGPT API key. I'd like to extend my appreciation for creating such a useful tool. However, I've noticed a significant difference in speed compared to a local GUI, similar to what's offered by GPT Academic.

llm-course

6 29,169 8.1 Jupyter Notebook

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Project mention: Ask HN: People who switched from GPT to their own models. How was it? | news.ycombinator.com | 2024-02-26

This is a very nice resource: https://github.com/mlabonne/llm-course

Flowise

21 24,074 9.9 TypeScript

Drag & drop UI to build your customized LLM flow

Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12

LLaMA-Factory

2 20,248 9.9 Python

Unify Efficient Fine-Tuning of 100+ LLMs

Project mention: Show HN: GPU Prices on eBay | news.ycombinator.com | 2024-02-23

Depends what model you want to train, and how well you want your computer to keep working while you're doing it.
If you're interested in large language models there's a table of vram requirements for fine-tuning at [1] which says you could do the most basic type of fine-tuning on a 7B parameter model with 8GB VRAM.
You'll find that training takes quite a long time, and as a lot of the GPU power is going on training, your computer's responsiveness will suffer - even basic things like scrolling in your web browser or changing tabs uses the GPU, after all.
Spend a bit more and you'll probably have a better time.
[1] https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#...
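The arithmetic behind VRAM figures like the one quoted above can be sketched roughly. The snippet below is only a back-of-the-envelope estimate of the memory needed to hold the weights of a 7B-parameter model at different precisions. The function name and the bytes-per-parameter table are illustrative assumptions, not taken from the LLaMA-Factory table, and the estimate ignores activations, optimizer state, adapter weights, and framework overhead, so real requirements are higher.

```python
# Rough weight-only memory estimate (illustrative assumption, not the
# LLaMA-Factory table). Ignores activations, optimizer state, adapter
# weights, and framework overhead, all of which add to the real total.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return the memory needed just to store the weights, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(7e9, precision)
        print(f"7B weights in {precision}: {gb:.1f} GB")
```

Under these assumptions, only the 8-bit and 4-bit weights of a 7B model fit within 8 GB, which is consistent with the comment's point that an 8 GB card supports only the most basic kind of fine-tuning.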

Chinese-LLaMA-Alpaca

4 17,348 8.3 Python

Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment

Project mention: Chinese-Alpaca-Plus-13B-GPTQ | /r/LocalLLaMA | 2023-05-30

I'd like to share with you today the Chinese-Alpaca-Plus-13B-GPTQ model, which is the GPTQ format quantised 4bit models of Yiming Cui's Chinese-LLaMA-Alpaca 13B for GPU reference.

langflow

28 17,467 10.0 JavaScript

⛓️ Langflow is a dynamic graph where each node is an executable unit. Its modular and interactive design fosters rapid experimentation and prototyping, pushing hard on the limits of creativity.

Project mention: News DataStax just bought our startup Langflow | news.ycombinator.com | 2024-04-04

Hey folks I'm the Head of DevRel @ DataStax here and just wanted to share to the HN community that in conjunction with this big acquisition news, the LF team has shipped 1.0-alpha of Langflow.
It's a simple `pip install` and the team would love any and all feedback!
https://github.com/logspace-ai/langflow/

ChatGLM2-6B

4 15,495 7.0 Python

ChatGLM2-6B: An Open Bilingual Chat LLM

Project mention: Are We Overlooking China's Progress in AI? | /r/singularity | 2023-06-26

LLMs-from-scratch

7 14,142 9.6 Jupyter Notebook

Implementing a ChatGPT-like LLM from scratch, step by step

Project mention: Insights from Finetuning LLMs for Classification Tasks | news.ycombinator.com | 2024-04-28

haystack

55 13,711 9.9 Python

LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Project mention: Haystack DB – 10x faster than FAISS with binary embeddings by default | news.ycombinator.com | 2024-04-28

I was confused for a bit but there is no relation to https://haystack.deepset.ai/

DeepLearningExamples

7 12,642 6.1 Jupyter Notebook

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
MOSS

4 11,819 8.5 Python

An open-source tool-augmented conversational language model from Fudan University

Project mention: Has anyone tried fine tuning on a dataset of complex tasks that require tool use? | /r/LocalLLaMA | 2023-05-05

FinGPT

11 11,485 9.5 Jupyter Notebook

FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

Project mention: GPT-4, without specialized training, beat a GPT-3.5 class model that cost $10B | news.ycombinator.com | 2024-03-24

There is also the open source FinGPT, that is claimed to beat GPT4 in some benchmarks at a fine tuning cost of $17.25.
https://github.com/AI4Finance-Foundation/FinGPT

Qwen

5 11,064 9.4 Python

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Project mention: What the heck is so great about this model? | /r/SillyTavernAI | 2023-12-07

Qwen: https://github.com/QwenLM/Qwen

ml-engineering

9 9,753 9.7 Python

Machine Learning Engineering Open Book

Project mention: Accelerators | news.ycombinator.com | 2024-02-22

ggml

69 9,725 9.8 C

Tensor library for machine learning

Project mention: LLMs on your local Computer (Part 1) | dev.to | 2024-03-11

    git clone https://github.com/ggerganov/ggml
    cd ggml
    mkdir build
    cd build
    cmake ..
    make -j4 gpt-j
    ../examples/gpt-j/download-ggml-model.sh 6B

FlexGen

39 9,007 3.0 Python

Running large language models on a single GPU for throughput-oriented scenarios.

Project mention: Run 70B LLM Inference on a Single 4GB GPU with This New Technique | news.ycombinator.com | 2023-12-03

Awesome-Multimodal-Large-Language-Models

2 8,991 9.7

Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

Project mention: Don't we need a leaderboard for visual models? | /r/LocalLLaMA | 2023-12-06

There is this one: https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models/tree/Evaluation As well as a leaderboard from OpenCompass (probably outdated): https://mmbench.opencompass.org.cn/leaderboard

LLMSurvey

3 8,825 7.9 Python

The official GitHub page for the survey paper "A Survey of Large Language Models".

Project mention: Ask HN: Textbook Regarding LLMs | news.ycombinator.com | 2024-03-23

Here’s another one - it’s older but has some interesting charts and graphs.
https://arxiv.org/abs/2303.18223

petals

98 8,684 8.3 Python

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

Project mention: Mistral Large | news.ycombinator.com | 2024-02-26

So how long until we can do an open source Mistral Large?
We could make a start on Petals or some other open source distributed training network cluster possibly?
[0] https://petals.dev/

nebuly

105 8,363 8.4 Python

The user analytics platform for LLMs

Project mention: Nebuly – The LLM Analytics Platform | news.ycombinator.com | 2023-10-07

deeplake

13 7,708 9.8 Python

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

Project mention: FLaNK AI Weekly 25 March 2025 | dev.to | 2024-03-25

Yi

9 7,141 9.4 Python

A series of large language models trained from scratch by developers @01-ai

Project mention: Yi: Open Foundation Models by 01.ai | news.ycombinator.com | 2024-03-10

The model license:
https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMEN...
1) Your use of the Yi Series Models must comply with the Laws and Regulations as

txtai

356 6,990 9.3 Python

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Project mention: Show HN: FileKitty – Combine and label text files for LLM prompt contexts | news.ycombinator.com | 2024-05-01

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

large-language-models related posts

Show HN: Hacker Search – A semantic search engine for Hacker News
3 projects | news.ycombinator.com | 2 May 2024

ChatGPT provides false information about people, and OpenAI can't correct it
1 project | news.ycombinator.com | 29 Apr 2024

Insights from Finetuning LLMs for Classification Tasks
1 project | news.ycombinator.com | 28 Apr 2024

Financial Market Applications of LLMs
1 project | news.ycombinator.com | 20 Apr 2024

Implementation for Mini-Gemini
1 project | news.ycombinator.com | 17 Apr 2024

News DataStax just bought our startup Langflow
1 project | news.ycombinator.com | 4 Apr 2024

Show HN: I made a library for LLM prompt injection/exploit/jailbreak detection
1 project | news.ycombinator.com | 3 Apr 2024

Index

What are some of the best open-source large-language-model projects? This list will help you:

#	Project	Stars
1	gpt_academic	55,872
2	llm-course	29,169
3	Flowise	24,074
4	LLaMA-Factory	20,248
5	Chinese-LLaMA-Alpaca	17,348
6	langflow	17,467
7	ChatGLM2-6B	15,495
8	LLMs-from-scratch	14,142
9	haystack	13,711
10	DeepLearningExamples	12,642
11	MOSS	11,819
12	FinGPT	11,485
13	Qwen	11,064
14	ml-engineering	9,753
15	ggml	9,725
16	FlexGen	9,007
17	Awesome-Multimodal-Large-Language-Models	8,991
18	LLMSurvey	8,825
19	petals	8,684
20	nebuly	8,363
21	deeplake	7,708
22	Yi	7,141
23	txtai	6,990