OpenPrompt
transformers
Metric | OpenPrompt | transformers
---|---|---
Mentions | 1 | 173
Stars | 4,141 | 124,557
Growth | 1.9% | 2.7%
Activity | 4.4 | 10.0
Last commit | 3 months ago | about 17 hours ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
-
AI enthusiasm #6 - Finetune any LLM you want
Most of this tutorial is based on the Hugging Face course on Transformers and on Niels Rogge's Transformers tutorials: make sure to check out their work and give them a star on GitHub, if you please ❤️
-
Schedule-Free Learning - A New Way to Train
* Superconvergence + LR range finder + Fast AI's Ranger21 optimizer was the go-to optimizer for CNNs, and worked fabulously well, but on transformers, the learning rate range finder said 1e-3 was the best, whilst 1e-5 actually worked better. However, the 1-cycle learning rate schedule stuck. https://github.com/huggingface/transformers/issues/16013
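To make the 1-cycle idea mentioned in the comment above concrete, here is a minimal pure-Python sketch of such a schedule: a linear warm-up to a peak learning rate followed by cosine annealing. The function name `one_cycle_lr` and its parameters are illustrative assumptions, not code from the linked issue, and the peak of 1e-5 is borrowed from the comment purely as an example value. PyTorch ships a fuller version as `torch.optim.lr_scheduler.OneCycleLR`.

```python
import math


def one_cycle_lr(step, total_steps, max_lr, pct_warmup=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at `step` under a simplified one-cycle policy.

    All names and defaults here are illustrative assumptions.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")

    initial_lr = max_lr / div_factor        # starting learning rate
    min_lr = initial_lr / final_div_factor  # learning rate at the last step
    warmup_steps = max(1, int(pct_warmup * total_steps))

    if step < warmup_steps:
        # Phase 1: linear ramp from initial_lr up to max_lr.
        progress = step / warmup_steps
        return initial_lr + progress * (max_lr - initial_lr)

    # Phase 2: cosine annealing from max_lr down to min_lr.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps - 1)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + cosine * (max_lr - min_lr)


# Example: 100 training steps with the peak at 1e-5.
schedule = [one_cycle_lr(s, total_steps=100, max_lr=1e-5) for s in range(100)]
print(f"start={schedule[0]:.2e} peak={max(schedule):.2e} end={schedule[-1]:.2e}")
```

The schedule only fixes the shape of the curve. The peak value still has to be chosen, and per the comment a range finder's suggestion may not carry over from CNNs to transformers.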
-
Gemma doesn't suck anymore - 8 bug fixes
Thanks! :) I'm pushing them into transformers, pytorch-gemma and collaborating with the Gemma team to resolve all the issues :)
The RoPE fix should already be in transformers 4.38.2: https://github.com/huggingface/transformers/pull/29285
My main PR for transformers which fixes most of the issues (some still left): https://github.com/huggingface/transformers/pull/29402
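Since the comment above ties the RoPE fix to transformers 4.38.2, one way to check whether a local install is at least that version is sketched below. The helper names `version_tuple` and `has_minimum_version` are hypothetical, and the parsing is deliberately simplified: it compares only the leading numeric components, so a release candidate such as `4.38.2rc1` counts as equal to `4.38.2`. The `packaging.version` module handles such cases more carefully.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '4.38.2' or '4.39.0.dev0' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def has_minimum_version(package, minimum):
    """Return True if `package` is installed at `minimum` or newer."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # 4.38.2 is the version named in the comment above.
    print(has_minimum_version("transformers", "4.38.2"))
```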
- HuggingFace Transformers: Qwen2
- HuggingFace Transformers Release v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2
- HuggingFace: Support for the Mixtral Moe
-
Paris-Based Startup and OpenAI Competitor Mistral AI Valued at $2B
If you want to tinker with the architecture, Hugging Face has a FOSS implementation in transformers: https://github.com/huggingface/transformers/blob/main/src/tr...
If you want to reproduce the training pipeline, you couldn't do that even if you wanted to because you don't have access to thousands of A100s.
-
Fail to reproduce the same evaluation metrics score during inference.
I am aware that using mixed precision reduces the stability of the weights and that there will be some inconsistency, but I didn't expect it to be this much. I have attached the graph of evaluation metrics. If someone can give me some insight into this issue, that would be great.
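As a rough illustration of why reduced precision can shift numbers noticeably, the sketch below accumulates the same small increment once in ordinary double precision and once with the running total rounded to IEEE half precision after every step. It is a toy example built on Python's `struct` module (format code `"e"`), not a reproduction of the poster's training setup. The helper names `to_half` and `accumulate` are assumptions made for this example.

```python
import struct


def to_half(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def accumulate(increment, steps, half_precision):
    """Add `increment` `steps` times, optionally rounding to fp16 each step."""
    total = 0.0
    for _ in range(steps):
        total += increment
        if half_precision:
            total = to_half(total)
    return total


full_sum = accumulate(0.001, 10_000, half_precision=False)
half_sum = accumulate(0.001, 10_000, half_precision=True)
print(f"double precision: {full_sum:.4f}")
print(f"half precision:   {half_sum:.4f}")
```

Once the half-precision total is large enough that the increment is smaller than half the spacing between representable values, each further addition rounds away and the total stops growing. Mixed-precision training typically keeps master weights and accumulations in higher precision for this reason, but small differences between training-time and inference-time arithmetic can still surface in evaluation metrics.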
-
[D] What is a good way to maintain code readability and code quality while scaling up complexity in libraries like Hugging Face?
In transformers, they tried really hard to have a single function or method to deal with both self and cross attention mechanisms, masking, positional and relative encodings, interpolation etc. While it allows a user to use the same function/method for any model, it has led to severe parameter bloat. Just compare the original implementation of llama by FAIR with the implementation by HF to get an idea.
-
Mixtral-7b-8expert working in Oobabooga (unquantized multi-gpu)
pip install git+https://github.com/huggingface/transformers.git@main
What are some alternatives?
autonlp - 🤗 AutoNLP: train state-of-the-art natural language processing models and deploy them in a scalable environment automatically
fairseq - Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
clip-as-service - Scalable embedding, reasoning, ranking for images and sentences with CLIP
sentence-transformers - Multilingual Sentence & Image Embeddings with BERT
camel_tools - A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
llama - Inference code for Llama models
nlp-recipes - Natural Language Processing Best Practices & Examples
transformer-pytorch - Transformer: PyTorch Implementation of "Attention Is All You Need"
thinc - 🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
chappie.ai - Generalized AI to perform a multitude of tasks written in python3
huggingface_hub - The official Python client for the Huggingface Hub.