Top 23 language-model Open-Source Projects

transformers

175 125,021 10.0 Python

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Project mention: Maxtext: A simple, performant and scalable Jax LLM | news.ycombinator.com | 2024-04-23

Is t5x an encoder/decoder architecture?
Some more general options.
The Flax ecosystem
https://github.com/google/flax?tab=readme-ov-file
or dm-haiku
https://github.com/google-deepmind/dm-haiku
were some of the best developed communities in the Jax AI field
Perhaps the “trax” repo? https://github.com/google/trax
Some HF examples https://github.com/huggingface/transformers/tree/main/exampl...
Sadly it seems much of the work is proprietary these days, but one example could be Grok-1, if you customize the details. https://github.com/xai-org/grok-1/blob/main/run.py

gpt4free

44 57,133 9.9 Python

The official gpt4free repository | various collection of powerful language models

Project mention: gpt4-openai-api VS gpt4free - a user suggested alternative | libhunt.com/r/gpt4-openai-api | 2024-01-04

I cant install

InfluxDB

www.influxdata.com sponsored

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
Prompt-Engineering-Guide

82 43,711 9.7 MDX

🐙 Guides, papers, lecture, notebooks and resources for prompt engineering

Project mention: FLaNK AI - 15 April 2024 | dev.to | 2024-04-15

Open-Assistant

329 36,622 9.1 Python

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

Project mention: Best open source AI chatbot alternative? | /r/opensource | 2023-12-08

For open assistant, the code: https://github.com/LAION-AI/Open-Assistant/tree/main/inference

stanford_alpaca

108 28,761 2.0 Python

Code and documentation to train Stanford's Alpaca models, and generate the data.

Project mention: How Open is Generative AI? Part 2 | dev.to | 2023-12-19

Alpaca is an instruction-oriented LLM derived from LLaMA, enhanced by Stanford researchers with a dataset of 52,000 examples of following instructions, sourced from OpenAI’s InstructGPT through the self-instruct method. The extensive self-instruct dataset, details of data generation, and the model refinement code were publicly disclosed. This model complies with the licensing requirements of its base model. Due to the utilization of InstructGPT for data generation, it also adheres to OpenAI’s usage terms, which prohibit the creation of models competing with OpenAI. This illustrates how dataset restrictions can indirectly affect the resulting fine-tuned model.

LLaMA-Factory

2 17,050 9.9 Python

Unify Efficient Fine-Tuning of 100+ LLMs

Project mention: Show HN: GPU Prices on eBay | news.ycombinator.com | 2024-02-23

Depends what model you want to train, and how well you want your computer to keep working while you're doing it.
If you're interested in large language models there's a table of vram requirements for fine-tuning at [1] which says you could do the most basic type of fine-tuning on a 7B parameter model with 8GB VRAM.
You'll find that training takes quite a long time, and as a lot of the GPU power is going on training, your computer's responsiveness will suffer - even basic things like scrolling in your web browser or changing tabs uses the GPU, after all.
Spend a bit more and you'll probably have a better time.
[1] https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#...

mlc-llm

89 16,774 9.9 Python

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Project mention: FLaNK 04 March 2024 | dev.to | 2024-03-04

WorkOS

workos.com sponsored

The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
StableLM

43 15,853 5.0 Jupyter Notebook

StableLM: Stability AI Language Models

Project mention: The Era of 1-bit LLMs: ternary parameters for cost-effective computing | news.ycombinator.com | 2024-02-28

https://github.com/Stability-AI/StableLM?tab=readme-ov-file#...

haystack

54 13,633 9.9 Python

:mag: LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Project mention: Release Radar • March 2024 Edition | dev.to | 2024-04-07

View on GitHub

RWKV-LM

84 11,619 8.8 Python

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

Project mention: Do LLMs need a context window? | news.ycombinator.com | 2023-12-25

https://github.com/BlinkDL/RWKV-LM#rwkv-discord-httpsdiscord... lists a number of implementations of various versions of RWKV.
https://github.com/BlinkDL/RWKV-LM#rwkv-parallelizable-rnn-w... :
> RWKV: Parallelizable RNN with Transformer-level LLM Performance (pronounced as "RwaKuv", from 4 major params: R W K V)
> RWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's 100% attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the "GPT" mode to quickly compute the hidden state for the "RNN" mode.
> So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding (using the final hidden state).
> "Our latest version is RWKV-6,*

ChatRWKV

28 9,276 8.3 Python

ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

Project mention: People who've used RWKV, whats your wishlist for it? | /r/LocalLLaMA | 2023-12-09

web-llm

42 9,018 9.0 TypeScript

Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.

Project mention: What stack would you recommend to build a LLM app in React without a backend? | /r/react | 2023-12-08

LoRA

34 9,046 5.4 Python

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Project mention: DECT NR+: A technical dive into non-cellular 5G | news.ycombinator.com | 2024-04-02

This seems to be an order of magnitude better than LoRa (https://lora-alliance.org/ not https://arxiv.org/abs/2106.09685). LoRa doesn't have all the features this one does like OFDM, TDM, FDM, and HARQ. I didn't know there's spectrum dedicated for DECT use.

tokenizers

8 8,395 8.5 Rust

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Project mention: HF Transfer: Speed up file transfers | /r/rust | 2023-07-07

Hugging Face seems to like Rust. They also wrote Tokenizers in Rust.

open_clip

27 8,391 8.4 Jupyter Notebook

An open source implementation of CLIP.

Project mention: A History of CLIP Model Training Data Advances | dev.to | 2024-03-13

While OpenAI’s CLIP model has garnered a lot of attention, it is far from the only game in town—and far from the best! On the OpenCLIP leaderboard, for instance, the largest and most capable CLIP model from OpenAI ranks just 41st(!) in its average zero-shot accuracy across 38 datasets.

LMFlow

10 8,000 9.6 Python

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Project mention: Your weekly machine learning digest | /r/learnmachinelearning | 2023-07-03

speechbrain

26 7,869 9.8 Python

A PyTorch-based Speech Toolkit

Project mention: SpeechBrain 1.0: A free and open-source AI toolkit for all things speech | news.ycombinator.com | 2024-02-28

ai

17 7,726 9.8 TypeScript

Build AI-powered applications with React, Svelte, Vue, and Solid

Project mention: Building a SQL Expert Bot: A Step-by-Step Guide with Vercel AI SDK and OpenAI API | dev.to | 2024-03-05

The Vercel AI SDK is built for OpenAI APIs and includes a range of tools for utilizing OpenAI APIs.

txtai

354 6,953 9.3 Python

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Project mention: Build knowledge graphs with LLM-driven entity extraction | dev.to | 2024-02-21

txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

gpt-neox

52 6,569 8.9 Python

An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

Project mention: FLaNK Stack 26 February 2024 | dev.to | 2024-02-26

BERT-pytorch

1 5,988 0.0 Python

Google AI 2018 BERT pytorch implementation
CogVLM

16 4,968 9.0 Python

a state-of-the-art-level open visual language model | 多模态预训练模型

Project mention: Mixtral: Mixture of Experts | news.ycombinator.com | 2024-01-08

CogVLM is very good in my (brief) testing: https://github.com/THUDM/CogVLM
The model weights seem to be under a non-commercial license, not true open source, but it is "open access" as you requested.

lm-evaluation-harness

34 4,957 9.9 Python

A framework for few-shot evaluation of language models.

Project mention: Mistral AI Launches New 8x22B Moe Model | news.ycombinator.com | 2024-04-09

The easiest is to use vllm (https://github.com/vllm-project/vllm) to run it on a Couple of A100's, and you can benchmark this using this library (https://github.com/EleutherAI/lm-evaluation-harness)

SaaSHub

www.saashub.com sponsored

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The open source projects on this list are ordered by number of github stars. The number of mentions indicates repo mentiontions in the last 12 Months or since we started tracking (Dec 2020).

language-model related posts

CatLIP: Clip Vision Accuracy with 2.7x Faster Pre-Training on Web-Scale Data
1 project | news.ycombinator.com | 25 Apr 2024
Multimodal Embeddings for JavaScript, Swift, and Python
1 project | news.ycombinator.com | 25 Apr 2024
Mistral AI Launches New 8x22B Moe Model
4 projects | news.ycombinator.com | 9 Apr 2024
Schedule-Free Learning – A New Way to Train
3 projects | news.ycombinator.com | 6 Apr 2024
DECT NR+: A technical dive into non-cellular 5G
1 project | news.ycombinator.com | 2 Apr 2024
Prompt Engineering Guide
1 project | news.ycombinator.com | 30 Mar 2024
Show HN: UForm v2 Featuring Multimodal Matryoshka, Multimodal DPO, and ONNX
1 project | news.ycombinator.com | 28 Mar 2024
A note from our sponsor - SaaSHub
www.saashub.com | 26 Apr 2024

SaaSHub helps you find the best software and product alternatives Learn more →

Index

What are some of the best open-source language-model projects? This list will help you:

	Project	Stars
1	transformers	125,021
2	gpt4free	57,133
3	Prompt-Engineering-Guide	43,711
4	Open-Assistant	36,622
5	stanford_alpaca	28,761
6	LLaMA-Factory	17,050
7	mlc-llm	16,774
8	StableLM	15,853
9	haystack	13,633
10	RWKV-LM	11,619
11	ChatRWKV	9,276
12	web-llm	9,018
13	LoRA	9,046
14	tokenizers	8,395
15	open_clip	8,391
16	LMFlow	8,000
17	speechbrain	7,869
18	ai	7,726
19	txtai	6,953
20	gpt-neox	6,569
21	BERT-pytorch	5,988
22	CogVLM	4,968
23	lm-evaluation-harness	4,957