EasyLM vs transformers

| | EasyLM | transformers |
|---|---|---|
| Mentions | 8 | 180 |
| Stars | 2,247 | 126,170 |
| Growth | - | 2.3% |
| Activity | 7.7 | 10.0 |
| Latest Commit | 5 months ago | 4 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
EasyLM
- Maxtext: A simple, performant and scalable Jax LLM
- How To Fine-Tune LLaMA, OpenLLaMA, And XGen, With JAX On A GPU Or A TPU
- Open-sourced LLMs are adept at mimicking ChatGPT's style but not its factuality. There exists a substantial capabilities gap, which requires better base LMs.
Title: The False Promise of Imitating Proprietary LLMs
Authors: Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song
Word Count: 3400
Average Reading Time: 18-20 minutes
Source Code: https://github.com/young-geng/EasyLM
Additional Links: https://huggingface.co/young-geng/koala-eval, https://huggingface.co/young-geng/koala
- Paid dev gig: develop a basic LLM PEFT finetuning utility
Check out EasyLM https://github.com/young-geng/EasyLM
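If it helps frame the ask, a bare-bones LoRA-style PEFT finetuning loop with transformers + peft might look something like the sketch below. The model id, data file, and hyperparameters are placeholders, not anything specified in the post.

```python
# Hypothetical minimal LoRA finetuning sketch using transformers + peft.
# Model id, dataset file, and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "openlm-research/open_llama_3b"  # any causal LM on the Hub
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Wrap the base model with low-rank adapters; only the adapter weights are trained.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Tokenize a plain-text training file into causal-LM examples.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```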
- OpenLLaMA Releases 7B/3B Checkpoints with 700B/600B Tokens
We release the weights in two formats: an EasyLM format to be used with our EasyLM framework, and a PyTorch format to be used with the Hugging Face transformers library.
- OpenLLaMA: An Open Reproduction of LLaMA
I am quite new to this, but I would like to get it running. Would the process roughly be:
1. Get a machine with decent GPU, probably rent cloud GPU.
2. On that machine download the weights/model/vocab files from https://huggingface.co/openlm-research/open_llama_7b_preview...
3. Install Anaconda. Clone https://github.com/young-geng/EasyLM/.
4. Install EasyLM:
conda env create -f scripts/gpu_environment.yml
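For the transformers route (the PyTorch-format weights mentioned above), loading a downloaded OpenLLaMA checkpoint could look roughly like the sketch below; the repo id is a placeholder for whichever checkpoint you actually fetched, and this is not the project's official instructions.

```python
# Rough sketch of loading PyTorch-format OpenLLaMA weights with transformers.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Placeholder repo id: substitute whichever OpenLLaMA checkpoint you downloaded.
checkpoint = "openlm-research/open_llama_7b"
tokenizer = LlamaTokenizer.from_pretrained(checkpoint)
model = LlamaForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float16).to("cuda")

prompt = "Q: What is the largest animal?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```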
- Koala: A Dialogue Model for Academic Research [Finetuned Llama-13B on a dataset generated by ChatGPT]
transformers
- XLSTM: Extended Long Short-Term Memory
Fascinating work, very promising.
Can you summarise how the model in your paper differs from this one?
https://github.com/huggingface/transformers/issues/27011
- AI enthusiasm #9 - A multilingual chatbot📣🈸
transformers is a package by Hugging Face that helps you interact with models on the HF Hub (GitHub)
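For example, a minimal way to pull a Hub checkpoint into a chatbot-style script with the transformers pipeline API; the model id and prompt are only illustrative.

```python
# Minimal example of interacting with a Hub model via the transformers pipeline API.
from transformers import pipeline

# Example model id: any text-generation checkpoint on the Hub works here.
generator = pipeline("text-generation", model="openlm-research/open_llama_3b")
reply = generator("User: Hello, how are you?\nAssistant:", max_new_tokens=40)
print(reply[0]["generated_text"])
```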
- Maxtext: A simple, performant and scalable Jax LLM
Is t5x an encoder/decoder architecture?
Some more general options: the Flax ecosystem
https://github.com/google/flax?tab=readme-ov-file
or dm-haiku
https://github.com/google-deepmind/dm-haiku
were some of the best-developed communities in the JAX AI field.
Perhaps the “trax” repo? https://github.com/google/trax
Some HF examples https://github.com/huggingface/transformers/tree/main/exampl...
Sadly it seems much of the work is proprietary these days, but one example could be Grok-1, if you customize the details. https://github.com/xai-org/grok-1/blob/main/run.py
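For a flavor of the Flax side of that ecosystem, a tiny linen module might look like this; it is purely illustrative and not taken from any of the linked repos.

```python
# Tiny illustrative Flax (linen) module; not from any of the linked repos.
import jax
import jax.numpy as jnp
import flax.linen as nn

class MLP(nn.Module):
    hidden: int

    @nn.compact
    def __call__(self, x):
        x = nn.relu(nn.Dense(self.hidden)(x))
        return nn.Dense(1)(x)

model = MLP(hidden=64)
# Initialize parameters from a PRNG key and a sample input, then apply the model.
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 16)))
y = model.apply(params, jnp.ones((4, 16)))
print(y.shape)  # (4, 1)
```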
- Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding
The HuggingFace transformers library already has support for a similar method called prompt lookup decoding that uses the existing context to generate an n-gram model: https://github.com/huggingface/transformers/issues/27722
I don't think it would be that hard to switch it out for a pretrained n-gram model.
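A rough sketch of enabling prompt lookup decoding, assuming a recent transformers release where generate() accepts prompt_lookup_num_tokens; the checkpoint name is only an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "openlm-research/open_llama_3b"  # example checkpoint; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "def fibonacci(n):\n    "
inputs = tokenizer(prompt, return_tensors="pt")

# Prompt lookup decoding: draft tokens are copied from n-grams already present
# in the context instead of being proposed by a separate draft model.
out = model.generate(**inputs, max_new_tokens=64, prompt_lookup_num_tokens=10)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```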
- AI enthusiasm #6 - Finetune any LLM you want💡
Most of this tutorial is based on the Hugging Face course about Transformers and on Niels Rogge's Transformers tutorials: make sure to check out their work and give them a star on GitHub, if you please ❤️
- Schedule-Free Learning – A New Way to Train
* Superconvergence + LR range finder + Fast AI's Ranger21 optimizer was the go-to optimizer for CNNs, and worked fabulously well, but on transformers the learning rate range finder said 1e-3 was the best, whilst 1e-5 actually worked better. However, the 1 cycle learning rate schedule stuck. https://github.com/huggingface/transformers/issues/16013
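For reference, the "1 cycle" policy mentioned there is available in PyTorch as OneCycleLR; a bare-bones sketch with purely illustrative values:

```python
# Bare-bones sketch of a one-cycle LR schedule in PyTorch; values are illustrative.
import torch

model = torch.nn.Linear(128, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
steps = 1000
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=1e-3, total_steps=steps)

for step in range(steps):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 128)).sum()  # dummy loss for illustration
    loss.backward()
    optimizer.step()
    scheduler.step()  # LR warms up to max_lr, then anneals back down
```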
- Gemma doesn't suck anymore – 8 bug fixes
Thanks! :) I'm pushing them into transformers, pytorch-gemma and collaborating with the Gemma team to resolve all the issues :)
The RoPE fix should already be in transformers 4.38.2: https://github.com/huggingface/transformers/pull/29285
My main PR for transformers which fixes most of the issues (some still left): https://github.com/huggingface/transformers/pull/29402
- HuggingFace Transformers: Qwen2
- HuggingFace Transformers Release v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2
- HuggingFace: Support for the Mixtral MoE
What are some alternatives?
mlc-llm - Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
fairseq - Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
camel - 🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society (NeurIPS 2023) https://www.camel-ai.org
sentence-transformers - Multilingual Sentence & Image Embeddings with BERT
Open-Llama - The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
llama - Inference code for Llama models
brev-cli - Connect your laptop to cloud computers. Follow to stay updated about our product
transformer-pytorch - Transformer: PyTorch Implementation of "Attention Is All You Need"
RWKV-LM - RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
modal-examples - Examples of programs built using Modal
huggingface_hub - The official Python client for the Huggingface Hub.