Top 16 finetuning Open-Source Projects
-
h2o-llmstudio
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://h2oai.github.io/h2o-llmstudio/
-
xTuring
Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6
-
DB-GPT-Hub
A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL
-
Awesome-Text2SQL
Curated tutorials and resources for Large Language Models, Text2SQL, Text2DSL, Text2API, Text2Vis and more.
-
finetune-gpt2xl
Guide: Finetune GPT2-XL (1.5 billion parameters) and GPT-Neo (2.7B) on a single GPU with Hugging Face Transformers using DeepSpeed
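Guides like this rely on DeepSpeed's ZeRO optimizer sharding and CPU offload to fit billion-parameter models on one GPU. As a rough illustration only (not the guide's actual file), a ZeRO stage-3 config with optimizer and parameter offload looks like this:

```json
{
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu" },
    "offload_param": { "device": "cpu" }
  },
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 8
}
```

Offloading optimizer state and parameters to CPU RAM trades training speed for GPU memory, which is the trade that makes single-GPU finetuning of GPT2-XL feasible at all.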
Project mention: Ask HN: Most efficient way to fine-tune an LLM in 2024? | news.ycombinator.com | 2024-04-04
Gemma 7b is 2.4x faster than HF + FA2.
Check out https://github.com/unslothai/unsloth for full benchmarks!
I really like the simplicity of this framework, and they hit on a lot of common problems found in other agent-based frameworks. Most intrigued by the RAG improvements.
Seems like Microsoft was frustrated with the pace of movement in this space and the poor results of agents (which, admittedly, turned my interest away from agents for the last few months). I'm interested again because it makes practical sense, and from looking at the example notebooks, it seems fairly easy to integrate into existing applications.
Maybe this is the 'low code' approach that might actually work, and bridge together engineering and non-engineering resources.
This example was what caught my eye: https://github.com/microsoft/FLAML/blob/main/notebook/autoge...
Project mention: Paid dev gig: develop a basic LLM PEFT finetuning utility | /r/LocalLLaMA | 2023-06-02
Project mention: I'm developing an open-source AI tool called xTuring, enabling anyone to construct a Language Model with just 5 lines of code. I'd love to hear your thoughts! | /r/machinelearningnews | 2023-09-07
Explore the project on GitHub here.
Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07
If you can find a large body of good, permissively licensed example code, you can finetune an LLM on it!
There was a similar attempt a few months ago, trained on Godot script, and it's reportedly pretty good:
https://github.com/minosvasilias/godot-dodo
I think more attempts haven't been made because base Llama is not that great at coding in general, relative to its other strengths, and stuff like StarCoder has flown under the radar.
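Before any such finetune, the "large body of good, permissively licensed example code" has to be filtered and packed into training examples. A minimal sketch of that data-prep step (the license markers and chunk size here are illustrative assumptions, not taken from godot-dodo or any project listed):

```python
# Sketch: collect permissively licensed source files and pack them
# into fixed-size chunks suitable as language-model training examples.

PERMISSIVE_MARKERS = ("MIT License", "Apache License", "BSD")

def is_permissive(source: str) -> bool:
    """Crude check: look for a permissive license marker in the file header."""
    header = source[:500]
    return any(marker in header for marker in PERMISSIVE_MARKERS)

def pack_examples(files: list[str], max_chars: int = 2048) -> list[str]:
    """Concatenate accepted files and split into fixed-size training chunks."""
    corpus = "\n\n".join(f for f in files if is_permissive(f))
    return [corpus[i:i + max_chars] for i in range(0, len(corpus), max_chars)]
```

A real pipeline would check SPDX identifiers or repository license metadata rather than grepping headers, and would chunk by tokens rather than characters, but the shape is the same.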
You can use the train script here https://github.com/kuutsav/llm-toys/blob/main/llm_toys/train.py. The readme contains a sample training command.
Project mention: A full tutorial on turning GPT-2 into a conversational AI | news.ycombinator.com | 2023-08-31
Hi, Vatsa here. This is a tutorial on turning GPT-2 into a conversational bot. It was a fun project, and I hope you like it!
GitHub: https://github.com/VatsaDev/nanoChatGPT
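The core move in a tutorial like this is flattening dialogue turns into delimited training strings the model learns to continue. A hedged sketch (the `<human>`/`<bot>` delimiters are hypothetical, not necessarily what nanoChatGPT uses; `<|endoftext|>` is GPT-2's standard end-of-text token):

```python
# Sketch: flatten alternating (user, bot) turns into one training example
# for a GPT-2 style model. Delimiter tokens here are illustrative only.

def format_conversation(turns: list[tuple[str, str]],
                        eos: str = "<|endoftext|>") -> str:
    """Join user/bot turn pairs into a single delimited training string."""
    parts = [f"<human> {user}\n<bot> {bot}" for user, bot in turns]
    return "\n".join(parts) + eos
```

At inference time the same delimiters let you prompt with `<human> ...\n<bot>` and stop generation at the next delimiter or the end-of-text token.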
Project mention: Zicklein - a German 🇩🇪 finetuned LlaMA-7b base model (OS) | /r/LocalLLaMA | 2023-05-22
Finetuning related posts
-
Ask HN: Most efficient way to fine-tune an LLM in 2024?
-
AMD ROCm Software Blogs
-
Show HN: We got fine-tuning Mistral-7B to not suck
-
Mistral 7B Fine-Tune Optimized
-
Has anyone tried out the ASPEN-Framework for LoRA Fine-Tuning yet and can share their experience?
-
Show HN: 80% faster, 50% less memory, 0% loss of accuracy Llama finetuning
-
80% faster, 50% less memory, 0% loss of accuracy Llama finetuning
-
Index
What are some of the best open-source finetuning projects? This list will help you:
| # | Project | Stars |
|---|---------|-------|
| 1 | unsloth | 8,974 |
| 2 | FLAML | 3,690 |
| 3 | h2o-llmstudio | 3,614 |
| 4 | learn2learn | 2,552 |
| 5 | xTuring | 2,525 |
| 6 | finetuner | 1,432 |
| 7 | DB-GPT-Hub | 1,069 |
| 8 | Awesome-Text2SQL | 1,060 |
| 9 | LLM-Finetuning-Toolkit | 676 |
| 10 | godot-dodo | 511 |
| 11 | finetune-gpt2xl | 422 |
| 12 | llm-toys | 115 |
| 13 | praetor-data | 63 |
| 14 | nanoChatGPT | 49 |
| 15 | Zicklein | 32 |
| 16 | reddit-finetune-frontend | 1 |