| | torchscale | RetNet |
|---|---|---|
| Mentions | 2 | 2 |
| Stars | 2,927 | 1,125 |
| Growth | 1.6% | - |
| Activity | 7.2 | 7.2 |
| Latest Commit | 25 days ago | 7 months ago |
| Language | Python | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
torchscale
- Retentive Network: A Successor to Transformer Implemented in PyTorch. A retnet commit has now appeared in Microsoft's torchscale repo: https://github.com/microsoft/torchscale/commit/bf65397b26469...
- [R] TorchScale: Transformers at Scale - Microsoft 2022 Shuming Ma et al - Improves modeling generality and capability, as well as training stability and efficiency.
RetNet
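For context on what this repo implements: retention replaces softmax attention with a decay-weighted recurrence that has an equivalent parallel form, which is what makes O(1)-per-token inference possible. A minimal single-head NumPy sketch (the function name and the fixed scalar decay `gamma` are illustrative; the actual implementations add per-head decays, scaling, gating, and group normalization, all omitted here):

```python
import numpy as np

def recurrent_retention(Q, K, V, gamma=0.9):
    """Recurrent form of single-head retention (simplified sketch):
        S_n = gamma * S_{n-1} + k_n^T v_n
        o_n = q_n @ S_n
    Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (seq_len, d_v)."""
    seq_len, d_k = Q.shape
    d_v = V.shape[1]
    S = np.zeros((d_k, d_v))          # recurrent state, updated per token
    out = np.zeros((seq_len, d_v))
    for n in range(seq_len):
        S = gamma * S + np.outer(K[n], V[n])
        out[n] = Q[n] @ S
    return out
```

The recurrent loop above is mathematically equivalent to the parallel (training-time) form `((Q @ K.T) * D) @ V`, where `D[n, m] = gamma**(n - m)` for `m <= n` and 0 otherwise, so the same weights support parallel training and recurrent decoding.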
What are some alternatives?
towhee - Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
fairscale - PyTorch extensions for high performance and large scale training.
bertviz - BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
extreme-bert - ExtremeBERT is a toolkit that accelerates the pretraining of customized language models on customized datasets, described in the paper “ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT”.
xformers - Hackable and optimized Transformers building blocks, supporting a composable construction.
glami-1m - The largest multilingual image-text classification dataset. It contains fashion products.
transformers - 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Multimodal-GPT