| | transformers | gpt-3-experiments |
|---|---|---|
| Mentions | 176 | 6 |
| Stars | 125,369 | 709 |
| Growth | 1.7% | - |
| Last commit | about 18 hours ago | almost 4 years ago |
| Activity | 10.0 | 0.0 |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
transformers
-
AI enthusiasm #9 - A multilingual chatbot📣🈸
transformers is a package by Hugging Face that helps you interact with models on the HF Hub (GitHub)
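For reference, a minimal sketch of pulling a model from the Hub with transformers (the "distilgpt2" checkpoint here is just an illustrative choice, not one named in the post):

```python
# Minimal transformers usage: download a model from the HF Hub and run it.
# "distilgpt2" is only an example checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
result = generator("Hello, my name is", max_new_tokens=20)
print(result[0]["generated_text"])
```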
-
Maxtext: A simple, performant and scalable Jax LLM
Is t5x an encoder/decoder architecture?
Some more general options.
The Flax ecosystem (https://github.com/google/flax) or dm-haiku (https://github.com/google-deepmind/dm-haiku) were some of the best-developed communities in the JAX AI field.
Perhaps the “trax” repo? https://github.com/google/trax
Some HF examples https://github.com/huggingface/transformers/tree/main/exampl...
Sadly it seems much of the work is proprietary these days, but one example could be Grok-1, if you customize the details. https://github.com/xai-org/grok-1/blob/main/run.py
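As a rough illustration of the Flax ecosystem mentioned a few comments up, here is a minimal flax.linen module; the model and shapes are arbitrary, chosen only to show the init/apply pattern:

```python
import jax
import jax.numpy as jnp
import flax.linen as nn  # https://github.com/google/flax

class TinyMLP(nn.Module):
    hidden: int

    @nn.compact
    def __call__(self, x):
        x = nn.Dense(self.hidden)(x)  # parameters are created lazily on first call
        x = nn.relu(x)
        return nn.Dense(1)(x)

model = TinyMLP(hidden=32)
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 8)))  # initialize parameters
y = model.apply(params, jnp.ones((4, 8)))                     # pure-function forward pass
```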
-
Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding
The HuggingFace transformers library already has support for a similar method called prompt lookup decoding that uses the existing context to generate an ngram model: https://github.com/huggingface/transformers/issues/27722
I don't think it would be that hard to switch it out for a pretrained ngram model.
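For context, prompt lookup decoding is exposed in transformers through the `prompt_lookup_num_tokens` argument to `generate()`; a minimal sketch (the gpt2 checkpoint is just a placeholder):

```python
# Prompt lookup decoding: draft candidate tokens from ngrams already present
# in the context, then verify them with the model in one forward pass.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The quick brown fox jumps over the lazy dog. The quick brown"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, prompt_lookup_num_tokens=3)
print(tok.decode(out[0], skip_special_tokens=True))
```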
-
AI enthusiasm #6 - Finetune any LLM you want💡
Most of this tutorial is based on the Hugging Face course about Transformers and on Niels Rogge's Transformers tutorials: make sure to check out their work and give them a star on GitHub, if you please ❤️
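A minimal fine-tuning sketch in the spirit of those tutorials, using the transformers Trainer API (the DistilBERT checkpoint, IMDB slice, and hyperparameters are illustrative assumptions, not taken from the post):

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

ds = load_dataset("imdb", split="train[:1%]")  # tiny slice, just for the sketch
ds = ds.map(lambda batch: tok(batch["text"], truncation=True,
                              padding="max_length", max_length=128), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=8,
                           num_train_epochs=1),
    train_dataset=ds,
)
trainer.train()
```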
-
Schedule-Free Learning – A New Way to Train
* Superconvergence + LR range finder + Fast AI's Ranger21 optimizer was the go-to combination for CNNs, and worked fabulously well, but on transformers the learning rate range finder said 1e-3 was the best, whilst 1e-5 actually worked better. However, the 1-cycle learning rate schedule stuck. https://github.com/huggingface/transformers/issues/16013
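For reference, a minimal sketch of the schedule-free optimizer under discussion (pip install schedulefree; the model and learning rate here are placeholders):

```python
# Schedule-free AdamW from facebookresearch/schedule_free: no LR schedule,
# but the optimizer must be switched between train and eval modes explicitly.
import torch
import schedulefree

model = torch.nn.Linear(10, 2)
optimizer = schedulefree.AdamWScheduleFree(model.parameters(), lr=1e-3)

optimizer.train()  # use training-mode (non-averaged) weights
for _ in range(100):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 10)).sum()
    loss.backward()
    optimizer.step()
optimizer.eval()   # switch to averaged weights before evaluation
```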
-
Gemma doesn't suck anymore – 8 bug fixes
Thanks! :) I'm pushing them into transformers, pytorch-gemma and collaborating with the Gemma team to resolve all the issues :)
The RoPE fix should already be in transformers 4.38.2: https://github.com/huggingface/transformers/pull/29285
My main PR for transformers which fixes most of the issues (some still left): https://github.com/huggingface/transformers/pull/29402
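To give a flavor of the RoPE precision issue (an illustrative sketch only, not the code from the linked PRs): rotary-embedding angles are sensitive to dtype, so the frequency math is typically kept in float32 even when the model runs in bfloat16.

```python
import torch

def rope_cos_sin(head_dim: int, seq_len: int, base: float = 10000.0):
    # Keep the angle computation in float32; doing it in bfloat16 loses
    # precision at large positions, the kind of subtle bug described above.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))
    positions = torch.arange(seq_len, dtype=torch.float32)
    angles = torch.outer(positions, inv_freq)  # (seq_len, head_dim // 2)
    return angles.cos(), angles.sin()          # cast down only at the very end, if at all
```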
- HuggingFace Transformers: Qwen2
- HuggingFace Transformers Release v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2
- HuggingFace: Support for the Mixtral Moe
-
Paris-Based Startup and OpenAI Competitor Mistral AI Valued at $2B
If you want to tinker with the architecture Hugging Face has a FOSS implementation in transformers: https://github.com/huggingface/transformers/blob/main/src/tr...
If you want to reproduce the training pipeline, you couldn't do that even if you wanted to because you don't have access to thousands of A100s.
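For reference, loading that implementation looks roughly like this (assumes the Mistral weights on the Hub, enough GPU memory, and accelerate installed for device_map):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto")

inputs = tok("The capital of France is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=10)
print(tok.decode(out[0], skip_special_tokens=True))
```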
gpt-3-experiments
-
AI chatbots are not a replacement for search engines
The problem with ChatGPT as a replacement for Google is that it was not designed to produce accurate facts, and it shows. This model cut its teeth writing articles about the discovery of unicorns in the Andes[0] for goodness sake! It's a language model, and a very impressive one at that, but language is used to express falsehoods and fiction just as regularly as it is used to express truth.
This doesn't mean that it can't produce accurate facts, most of the time it does! But when it does produce nonsense, it does it in exactly the same tone of authority, so if you don't already know the answer you may well walk away believing an AI hallucination.
And the trouble is it doesn't really matter if everyone here thinks "well, I would follow up each request with research to verify the answer", because most people won't! This is like the Google answer extracts, which fairly frequently mislead by extracting out-of-context quotes, except that there's no way to get the original context and there may in fact be no original context! This makes follow-up research much more complicated than with Google and therefore unlikely to happen. If ChatGPT replaces Google, the amount of nonsense on the internet will get even worse, which is something that until 2022 I never thought was possible.
[0] https://github.com/minimaxir/gpt-3-experiments/blob/master/e...
- Artificial Intelligence writes
-
The Computers Are Getting Better at Writing
See also my experiments with GPT-3 on sane prompts, which have wildly varying quality even after generating them in bulk: https://github.com/minimaxir/gpt-3-experiments
Creative writing hasn't been one of the super-hyped use cases for the OpenAI API outside of AI Dungeon, surprisingly. For just random generation, the necessary curation can detract from the time-savings advantages. (As an aside, the API is also extremely expensive for long-form content, to the point that I'm not sure how the economics work for these startups even with monthly fees.)
I'm more bullish on small bespoke models for a given use case, which is what I spend my time researching.
-
Does GPT-2 Know Your Phone Number?
Thanks, didn't twig onto the fact that you linked a subtree of the whole repo. Weird that even with the nonzero temp the AskReddit prompt went a bit loopy.
> https://github.com/minimaxir/gpt-3-experiments/blob/master/e...
Oh my goodness that is absurd in the most delightful way. Thanks for sharing that.
What are some alternatives?
fairseq - Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
languagetool - Style and Grammar Checker for 25+ Languages
sentence-transformers - Multilingual Sentence & Image Embeddings with BERT
chatgpt-google-extension - A browser extension that enhances search engines with ChatGPT
llama - Inference code for Llama models
vim-LanguageTool - A vim plugin for the LanguageTool grammar checker
transformer-pytorch - Transformer: PyTorch Implementation of "Attention Is All You Need"
Gleemin - A Magic: the Gathering™ expert system
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
chatgpt-raycast - ChatGPT raycast extension
huggingface_hub - The official Python client for the Huggingface Hub.
THELEMA - My MSc thesis: a grammar induction system