llm-foundry vs WizardLM

Project	llm-foundry	WizardLM
Mentions	37	38
Stars	3,730	7,531
Growth	4.0%	-
Activity	9.7	9.4
Latest Commit	4 days ago	8 months ago
Language	Python	Python
License	Apache License 2.0	-

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

llm-foundry

Posts with mentions or reviews of llm-foundry. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-05.

Fine Tuning Mistral 7B on Magic the Gathering Draft
4 projects | news.ycombinator.com | 5 Dec 2023

Related comment from gwern: https://news.ycombinator.com/item?id=38438859
Also, why QLoRA rather than a full finetune? Using LambdaLabs, it'd cost roughly the same as your quote. Cheaper, I think, if you're willing to gamble with fp8: https://github.com/mosaicml/llm-foundry/tree/main/scripts/tr.... And there are fewer hyperparameters to tune as well.
Consortium launched to build the largest open LLM
1 project | news.ycombinator.com | 18 Oct 2023

Traditionally, training runs can "explode" and fail, but there are methods to incrementally back them up and resume when that happens, see https://www.mosaicml.com/blog/mpt-7b
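The back-up-and-resume idea in the comment above can be sketched with a toy loop. This is a minimal illustration, not MosaicML's training code: the `train` function, the JSON checkpoint file, the halving "loss", and the `fail_at` crash switch are all hypothetical stand-ins.

```python
import json
import os
import tempfile


def train(total_steps, ckpt_path, save_every=2, fail_at=None):
    """Toy loop: the 'loss' halves each step; state is saved every `save_every` steps."""
    # Resume from the last checkpoint if one exists, otherwise start fresh.
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "loss": 1024.0}

    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated divergence or hardware failure")
        state["loss"] /= 2  # hypothetical stand-in for one optimizer step
        state["step"] += 1
        if state["step"] % save_every == 0:
            # Write to a temp file and rename it, so a crash mid-write
            # cannot leave a corrupt checkpoint behind.
            tmp = ckpt_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, ckpt_path)
    return state


ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
try:
    train(10, ckpt, fail_at=5)  # crashes after step 5; last save was step 4
except RuntimeError:
    pass
final = train(10, ckpt)  # picks up from step 4 and runs to step 10
```

The work done between the last save and the crash (step 5 here) is repeated on resume, which is the usual trade-off when choosing how often to checkpoint.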
Applying All Recent Innovations To Train a Code Model
2 projects | dev.to | 11 Aug 2023

MosaicML released the MPT-7B model family, whose StoryWriter variant handles a context of 65k tokens, thanks to the ALiBi position encoding.
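For readers unfamiliar with ALiBi: instead of position embeddings, it adds a per-head linear penalty to attention scores based on how far back each key is, which is what lets a model run on longer inputs than it was trained on. The sketch below shows only the bias computation in plain Python; the function names are hypothetical, it assumes a power-of-two head count, and it is not taken from the MPT code.

```python
import math


def alibi_slopes(n_heads):
    """Per-head slopes 2^(-8/n), 2^(-16/n), ... for a power-of-two head count."""
    if n_heads <= 0 or n_heads & (n_heads - 1) != 0:
        raise ValueError("this sketch handles power-of-two head counts only")
    return [2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)]


def alibi_bias(seq_len, slope):
    """Causal bias matrix: -slope * distance for past keys, -inf for future keys."""
    return [
        [-slope * (i - j) if j <= i else -math.inf for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)           # 0.5, 0.25, ..., 1/256
bias = alibi_bias(4, slopes[0])    # added to one head's query-key scores before softmax
```

Because the penalty depends only on the distance `i - j`, the same rule applies unchanged at any sequence length.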
Fine Tuning Language Models
1 project | news.ycombinator.com | 3 Jul 2023

Most AI runners just ignore licensing and run LLaMA finetunes.
But if you want to avoid the non-commercial LLaMA license, you have 3 good options for a base model:
- OpenLLaMA 13B
- MPT 30B
- Falcon 40B
Of these, Falcon 40B is very difficult to run (slow in 4-bit, basically requires a professional GPU, no good CPU offloading yet).
OpenLLaMA 13B only supports a context size of 2048 as of today, but that could change soon.
So you probably want MPT instruct 30B, specifically this one:
https://huggingface.co/TheBloke/mpt-30B-instruct-GGML
As the page says, you can try it out on a decent PC of your own with the OpenCL build of KoboldCPP. Change it to "instruct" mode, use the template on the page, and offload as many layers as you can to your PC's dGPU. It may already work for your summarization needs.
If not, you can finetune it with MPT's code and summarization d
https://github.com/mosaicml/llm-foundry
Or train OpenLLaMA 13B with SuperHOT + summarization data using QLoRA.
Finetune MPT-30B using QLORA
2 projects | /r/LocalLLaMA | 3 Jul 2023

BTW, they finally merged an MPT patch to work with LoRA: https://github.com/mosaicml/llm-foundry/issues/304
[N] Meet MPT-30B: A Fully Open-Source LLM that Outperforms GPT-3 - Dr. Mandar Karhade, MD. PhD.
2 projects | /r/MachineLearning | 1 Jul 2023
MPT-30B QLoRA on 24 GB VRAM
2 projects | /r/LocalLLaMA | 30 Jun 2023

Did you run into this error while using QLoRA on MPT-30B?: https://github.com/mosaicml/llm-foundry/issues/413
MosaicML Agrees to Join Databricks to Power Generative AI for All
3 projects | /r/LocalLLaMA | 26 Jun 2023

Yes? Their GitHub repo is under Apache, their base model is under Apache, the training data is not theirs, and they provide scripts showing how to convert it for the pretraining step. They have scripts for pretraining and finetuning as well. Basically for everything.
Best model for commercial use?
1 project | /r/LocalLLaMA | 26 Jun 2023

mosaicml/llm-foundry: LLM training code for MosaicML foundation models (github.com)
MosaicML launches MPT-30B: A new open-source model that outperforms GPT-3
1 project | /r/mlwires | 25 Jun 2023

MosaicML, a company that provides a platform for training and deploying large language models (LLMs), has recently released its second open-source foundation model called MPT-30B. The model is part of the MosaicML Foundation Series and comes after the smaller MPT-7B model that was launched in May 2023.

WizardLM

Posts with mentions or reviews of WizardLM. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-22.

FLaNK AI-April 22, 2024
28 projects | dev.to | 22 Apr 2024
Refact LLM: New 1.6B code model reaches 32% HumanEval and is SOTA for the size
4 projects | news.ycombinator.com | 4 Sep 2023

This is interesting work, and a good contribution, but there is no need to mislead people.
[1] https://github.com/nlpxucan/WizardLM
Continue with LocalAI: An alternative to GitHub's Copilot that runs everything locally
4 projects | /r/selfhosted | 30 Aug 2023

If you pair this with the latest WizardCoder models, which perform somewhat better than the standard Salesforce Codegen2 and Codegen2.5, you have a pretty solid alternative to GitHub Copilot that runs completely locally.
WizardCoder context?
1 project | /r/LocalLLaMA | 20 Aug 2023
The world's most-powerful AI model suddenly got 'lazier' and 'dumber.' A radical redesign of OpenAI's GPT-4 could be behind the decline in performance.
3 projects | /r/ChatGPT | 13 Jul 2023
Official WizardLM-13B-V1.1 Released! Train with Only 1K Data! Can Achieve 86.32% on AlpacaEval!
5 projects | /r/LocalLLaMA | 7 Jul 2023

(We will update the demo links in our github.)
GPT-4 API general availability
15 projects | news.ycombinator.com | 6 Jul 2023

In terms of speed, we're talking about 140t/s for 7B models, and 40t/s for 33B models on a 3090/4090 now.[1] (1 token ~= 0.75 word) It's quite zippy. llama.cpp performs close on Nvidia GPUs now (but they don't have a handy chart) and you can get decent performance on 13B models on M1/M2 Macs.
You can take a look at a list of evals here: https://llm-tracker.info/books/evals/page/list-of-evals - for general usage, I think home-rolled evals like llm-jeopardy [2] and local-llm-comparison [3] by hobbyists are more useful than most of the benchmark rankings.
That being said, personally I mostly use GPT-4 for code assistance so that's what I'm most interested in, and the latest code assistants are scoring quite well: https://github.com/abacaj/code-eval - a recent replit-3b fine tune the human-eval results for open models (as a point of reference, GPT-3.5 gets 60.4 on pass@1 and 68.9 on pass@10 [4]) - I've only just started playing around with it since replit model tooling is not as good as llamas (doc here: https://llm-tracker.info/books/howto-guides/page/replit-mode...).
I'm interested in potentially applying reflexion or some of the other techniques that have been tried to even further increase coding abilities. (InterCode in particular has caught my eye https://intercode-benchmark.github.io/)
[1] https://github.com/turboderp/exllama#results-so-far
[2] https://github.com/aigoopy/llm-jeopardy
[3] https://github.com/Troyanovsky/Local-LLM-comparison/tree/mai...
[4] https://github.com/nlpxucan/WizardLM/tree/main/WizardCoder
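The pass@1 and pass@10 figures quoted above are usually computed with the unbiased estimator introduced alongside HumanEval: generate n samples per problem, count the c that pass the tests, and estimate the chance that at least one of k drawn samples passes. A minimal sketch follows; whether the linked repositories compute their numbers exactly this way is an assumption, and the function name is illustrative.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k for one problem: 1 - C(n - c, k) / C(n, k)."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when k > n - c: every k-subset then contains a passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Per-problem (n, c) counts; the benchmark score is the mean over problems.
results = [(10, 3), (10, 0), (10, 10)]
score_at_1 = sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to the plain fraction c / n, so pass@1 is just the average per-sample success rate.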
WizardLM-13B-V1.0-Uncensored
1 project | /r/VenusAI_Official | 1 Jul 2023

You talking about this? https://github.com/nlpxucan/WizardLM
What 7b llm to use
1 project | /r/LocalLLaMA | 23 Jun 2023

The smallest model that is close to competent at code is WizardCoder 15B: https://github.com/nlpxucan/WizardLM/
16-Jun-2023
1 project | /r/dailyainews | 15 Jun 2023

WizardCoder: Empowering Code Large Language Models with Evol-Instruct (https://github.com/nlpxucan/WizardLM/tree/main/WizardCoder)

What are some alternatives?

When comparing llm-foundry and WizardLM you can also consider the following projects:

qlora - QLoRA: Efficient Finetuning of Quantized LLMs

private-gpt - Interact with your documents using the power of GPT, 100% privately, no data leaks

basaran - Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.

llm-humaneval-benchmarks

RasaGPT - 💬 RasaGPT is the first headless LLM chatbot platform built on top of Rasa and Langchain. Built w/ Rasa, FastAPI, Langchain, LlamaIndex, SQLModel, pgvector, ngrok, telegram

exllama - A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.

LMFlow - An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

airoboros - Customizable implementation of the self-instruct paper.

prompt-engineering - ChatGPT Prompt Engineering for Developers - deeplearning.ai

promptfoo - Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.

llm-numbers - Numbers every LLM developer should know

can-ai-code - Self-evaluating interview for AI coders
