learning-machine vs transformers

| | learning-machine | transformers |
|---|---|---|
| Mentions | 10 | 176 |
| Stars | 486 | 125,369 |
| Growth | - | 1.7% |
| Activity | 0.0 | 10.0 |
| Latest commit | 3 months ago | 3 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
learning-machine
- Show HN: ML Questions Answered
-
Show HN: A Machine Learning Book: Learn ML by Reading Answers, Like SO
Hi HN, original poster here!
We made a compilation (book) of questions we received from 1,300+ students in this course [1].
We believe a Stack Overflow-like Q&A format is best for learning, so we built this.
Project Repo: https://github.com/rentruewang/learning-machine
Website: https://rentruewang.github.io/learning-machine
The website is hosted on GitHub and automatically built from the repo by GitHub Actions.
We were lucky to get some feedback on Reddit here [2], here [3], and here [4], and have made changes accordingly. We'd really like to know what you on HN think. Any suggestions are welcome!
[1] https://speech.ee.ntu.edu.tw/~hylee/ml/2021-spring.html
[2] https://www.reddit.com/r/datascience/comments/oz7xab/open_so...
[3] https://www.reddit.com/r/learnmachinelearning/comments/oz78n...
[4] https://www.reddit.com/r/MachineLearning/comments/oz7p26/p_o...
- Show HN: A Machine Learning Book: Learn ML by Reading Answers, Like SO
-
Open Sourced a Machine Learning Book: Learn Machine Learning By Reading Answers, Just Like StackOverflow
[Project Repo](https://github.com/rentruewang/learning-machine)
- Learn machine learning by reading answers to questions, like Stack Overflow.
-
Learn ML from answers to other people's questions!
Website, Project Repo.
-
Learn machine learning by reading answers to someone else's questions.
We are working on a compilation of frequently asked questions! We aim to answer beginner-unfriendly questions in a simple, straightforward way so that they never have to be asked again. Hope you find this helpful! Website, Project Repo.
- Show HN: A handbook to help students learn machine learning
transformers
-
AI enthusiasm #9 - A multilingual chatbot📣🈸
transformers is a package by Hugging Face that helps you interact with models on the HF Hub (GitHub).
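As a rough illustration of that interaction, the sketch below loads a Hub model through the transformers pipeline API; the model id is just an example, and any Hub model for the same task would work:

```python
# Minimal sketch: pull a model from the HF Hub via the pipeline API.
# The model id below is illustrative; any sentiment model on the Hub works.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("Transformers makes Hub models easy to use."))
```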
-
Maxtext: A simple, performant and scalable Jax LLM
Is t5x an encoder/decoder architecture?
Some more general options: the Flax ecosystem
https://github.com/google/flax?tab=readme-ov-file
or dm-haiku
https://github.com/google-deepmind/dm-haiku
are some of the best-developed communities in the JAX AI field (see the Flax sketch at the end of this thread).
Perhaps the “trax” repo? https://github.com/google/trax
Some HF examples https://github.com/huggingface/transformers/tree/main/exampl...
Sadly it seems much of the work is proprietary these days, but one example could be Grok-1, if you customize the details. https://github.com/xai-org/grok-1/blob/main/run.py
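For readers unfamiliar with the Flax ecosystem mentioned above, here is a minimal sketch of a Flax linen module; the module name and layer sizes are illustrative, not taken from any of the linked repos:

```python
# Minimal Flax linen sketch; MLP and its sizes are illustrative.
import jax
import jax.numpy as jnp
import flax.linen as nn

class MLP(nn.Module):
    hidden: int = 128  # illustrative hidden width

    @nn.compact
    def __call__(self, x):
        x = nn.relu(nn.Dense(self.hidden)(x))
        return nn.Dense(1)(x)

# Initialize parameters with a dummy batch of 16-dim inputs.
params = MLP().init(jax.random.PRNGKey(0), jnp.ones((1, 16)))
```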
-
Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding
The HuggingFace transformers library already has support for a similar method called prompt lookup decoding that uses the existing context to generate an ngram model: https://github.com/huggingface/transformers/issues/27722
I don't think it would be that hard to switch it out for a pretrained ngram model.
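For context, recent transformers releases expose prompt lookup decoding through a prompt_lookup_num_tokens argument to generate(); a minimal sketch, with an illustrative model id and values:

```python
# Sketch of prompt lookup decoding via transformers' generate();
# assumes a recent transformers release with prompt_lookup_num_tokens.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # illustrative model id
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The quick brown fox jumps over the", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, prompt_lookup_num_tokens=10)
print(tok.decode(out[0], skip_special_tokens=True))
```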
-
AI enthusiasm #6 - Finetune any LLM you want💡
Most of this tutorial is based on the Hugging Face course about Transformers and on Niels Rogge's Transformers tutorials: make sure to check out their work and give them a star on GitHub, if you please ❤️
-
Schedule-Free Learning – A New Way to Train
* Superconvergence + the LR range finder + Fast AI's Ranger21 optimizer was the go-to combination for CNNs and worked fabulously well, but on transformers the learning rate range finder said 1e-3 was best, whilst 1e-5 actually worked better. However, the 1-cycle learning rate schedule stuck. https://github.com/huggingface/transformers/issues/16013
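The 1-cycle schedule referred to above is available out of the box in PyTorch; a minimal sketch, where the model, max_lr, and step counts are all illustrative:

```python
# Minimal 1-cycle LR sketch with PyTorch's OneCycleLR;
# all hyperparameters here are illustrative.
import torch

model = torch.nn.Linear(10, 2)
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
sched = torch.optim.lr_scheduler.OneCycleLR(opt, max_lr=1e-3, total_steps=1000)

for step in range(1000):
    opt.zero_grad()
    loss = model(torch.randn(8, 10)).sum()  # dummy loss on random data
    loss.backward()
    opt.step()
    sched.step()  # advance the 1-cycle schedule once per optimizer step
```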
-
Gemma doesn't suck anymore – 8 bug fixes
Thanks! :) I'm pushing them into transformers and pytorch-gemma, and collaborating with the Gemma team to resolve all the issues :)
The RoPE fix should already be in transformers 4.38.2: https://github.com/huggingface/transformers/pull/29285
My main PR for transformers which fixes most of the issues (some still left): https://github.com/huggingface/transformers/pull/29402
- HuggingFace Transformers: Qwen2
- HuggingFace Transformers Release v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2
- HuggingFace: Support for the Mixtral MoE
-
Paris-Based Startup and OpenAI Competitor Mistral AI Valued at $2B
If you want to tinker with the architecture Hugging Face has a FOSS implementation in transformers: https://github.com/huggingface/transformers/blob/main/src/tr...
If you want to reproduce the training pipeline, you couldn't do that even if you wanted to because you don't have access to thousands of A100s.
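To make the "tinker with the architecture" point concrete: transformers lets you instantiate a scaled-down, randomly initialized Mistral from its config class. A sketch, with toy config values far smaller than the released model:

```python
# Sketch: a tiny, randomly initialized Mistral built from transformers'
# open implementation; config values are illustrative toy sizes.
from transformers import MistralConfig, MistralForCausalLM

cfg = MistralConfig(
    hidden_size=256,
    intermediate_size=512,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_key_value_heads=2,
    vocab_size=32000,
)
tiny = MistralForCausalLM(cfg)
print(sum(p.numel() for p in tiny.parameters()), "parameters")
```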
What are some alternatives?
fairseq - Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
sentence-transformers - Multilingual Sentence & Image Embeddings with BERT
llama - Inference code for Llama models
transformer-pytorch - Transformer: PyTorch Implementation of "Attention Is All You Need"
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
huggingface_hub - The official Python client for the Hugging Face Hub.
OpenNMT-py - Open Source Neural Machine Translation and (Large) Language Models in PyTorch
sentencepiece - Unsupervised text tokenizer for Neural Network-based text generation.
Swin-Transformer-Tensorflow - Unofficial implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" (https://arxiv.org/abs/2103.14030)
faiss - A library for efficient similarity search and clustering of dense vectors.
KoboldAI-Client
gpt-neo - An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.