chain-of-thought-hub
airoboros
Metric | chain-of-thought-hub | airoboros
---|---|---
Mentions | 10 | 8
Stars | 2,371 | 948
Growth | - | -
Activity | 6.9 | 8.7
Last commit | 10 days ago | about 2 months ago
Language | Jupyter Notebook | Python
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
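The activity description above can be illustrated with a small sketch. The page does not publish its formula, so everything here is an assumption: the exponential decay, the hypothetical `half_life_days` parameter, and the percentile mapping are one plausible way to get a score where recent commits weigh more and 9.0 corresponds to the top 10% of tracked projects.

```python
from datetime import date


def recency_weighted_commits(commit_dates, today, half_life_days=30.0):
    """Sum commits with exponential decay (assumed weighting).

    A commit made today counts 1.0; a commit half_life_days old counts 0.5.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_score(raw, all_raw):
    """Map a raw value to a 0-10 score by percentile rank (assumed mapping).

    The score is 10 times the fraction of tracked projects with a strictly
    lower raw value, so 9.0 means 90% of projects are less active.
    """
    below = sum(1 for r in all_raw if r < raw)
    return round(10.0 * below / len(all_raw), 1)


if __name__ == "__main__":
    today = date(2023, 6, 30)
    commits = [date(2023, 6, 30), date(2023, 5, 31)]  # today and 30 days ago
    print(recency_weighted_commits(commits, today))
    print(activity_score(9, list(range(10))))
```

Using a percentile rank rather than the raw weighted count keeps the score relative, which matches the description of activity as "a relative number".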
chain-of-thought-hub
- Chain-Of-Thought Hub: Measuring LLMs' Reasoning Performance
- All Model Leaderboards (that I know)
  Chain-of-Thought Hub https://github.com/FranxYao/chain-of-thought-hub - these are mostly gathered, although Yao Fu, the author, is working on specific CoT runs
- It looks likely that the MMLU score on Hugging Face's LLM leaderboard is wrong after all.
- (2/2) May 2023
  Chain-of-Thought Hub: Measuring LLMs' Reasoning Performance (https://github.com/FranxYao/chain-of-thought-hub)
- Ask HN: Is it just me or GPT-4's quality has significantly deteriorated lately?
  https://github.com/FranxYao/chain-of-thought-hub
- [N] Chain-of-Thought Hub: Measuring LLMs' Reasoning Performance
- Chain-of-Thought Hub: Measuring LLMs' Reasoning Performance
airoboros
- TinyLlama project aims to pretrain a 1.1B Llama model on 3T tokens
- Airoboros: Customizable implementation of the self-instruct paper
- airoboros (tool) overhaul
  Just wanted to drop a note that I overhauled the airoboros tool (not the models) to include most of the prompts I've been using to build the datasets, plus a couple of extras.
- (2/2) May 2023
  airoboros: using large language models to fine-tune large language models (https://github.com/jondurbin/airoboros)
- Airoboros [7B/13B]
  This is a fine-tuned LLaMA model, using completely synthetic training data created by https://github.com/jondurbin/airoboros
- airoboros-13b - 98% eval vs gpt-3.5-turbo
  I used airoboros, a Python tool I wrote, to generate the synthetic instruction/response pairs, and included a jailbreak prompt to attempt to bypass OpenAI censorship. This is the only dataset used to fine-tune the model.
- [P] airoboros 7b - instruction tuned on 100k synthetic instruction/responses
  This is a 7b parameter model, fine-tuned on 100k synthetic instruction/response pairs generated by gpt-3.5-turbo using airoboros, my version of self-instruct
- [P] airoboros: a rewrite of self-instruct/alpaca synthetic prompt generation
  GitHub Repo
What are some alternatives?
DB-GPT - AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents
WizardLM - Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder and WizardMath
llm-leaderboard - A joint community effort to create one central leaderboard for LLMs.
TinyLlama - The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
tree-of-thoughts - Plug in and Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by at least 70%
WizardVicunaLM - LLM that combines the principles of wizardLM and vicunaLM
llm-humaneval-benchmarks
datablations - Scaling Data-Constrained Language Models
GirlfriendGPT - Girlfriend GPT is a Python project to build your own AI girlfriend using ChatGPT4.0
gorilla - Gorilla: An API store for LLMs
gptqlora - GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ