| | YaLM-100B | SLIDE |
|---|---|---|
| Mentions | 1 | 3 |
| Stars | 0 | 475 |
| Growth | - | -0.4% |
| Activity | 0.0 | 0.0 |
| Latest commit | almost 2 years ago | over 2 years ago |
| Language | Python | - |
| License | Apache License 2.0 | - |
The number of mentions indicates the total number of mentions we've tracked, plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
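The activity metric is described only qualitatively above. As a rough illustration (not the site's actual formula), a recency-weighted score can be built by exponentially decaying each commit's contribution by its age, so recent commits count more than older ones:

```python
import time

def activity_score(commit_timestamps, now=None, half_life_days=30.0):
    """Hypothetical recency-weighted activity score: each commit
    contributes 1.0 when brand new and half as much for every
    `half_life_days` of age. Not the real site metric."""
    now = time.time() if now is None else now
    day = 86400.0
    return sum(
        0.5 ** ((now - ts) / (half_life_days * day))
        for ts in commit_timestamps
    )
```

With a 30-day half-life, a commit made today contributes 1.0 and a commit made 30 days ago contributes 0.5, so the score naturally favors recently active projects.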
YaLM-100B
Posts with mentions or reviews of YaLM-100B.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-06-23.
- Yandex opensources 100B parameter GPT-like model
I downloaded the weights and made a .torrent file (also a magnet link, see raw README.md). Can somebody else who also downloaded the files double-check the checksums?
https://github.com/lostmsu/YaLM-100B/tree/Torrent
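For anyone wanting to double-check large downloaded weight files like this, a minimal self-contained sketch follows. The function names are mine, and it assumes a `sha256sum`-style manifest of `<hex digest>  <filename>` lines rather than any specific file from the torrent:

```python
import hashlib

def sha256sum(path, chunk_size=1 << 20):
    """Stream-hash a file in 1 MiB chunks so multi-gigabyte
    weight shards never need to fit in memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(checksum_file):
    """Check files against a sha256sum-style manifest:
    one '<hex digest>  <path>' entry per line."""
    all_ok = True
    with open(checksum_file) as f:
        for line in f:
            expected, name = line.split(maxsplit=1)
            name = name.strip()
            status = "OK" if sha256sum(name) == expected else "MISMATCH"
            if status == "MISMATCH":
                all_ok = False
            print(f"{name}: {status}")
    return all_ok
```

The same check can also be done from a shell with `sha256sum -c checksums.txt` when a manifest in that format is available.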
SLIDE
Posts with mentions or reviews of SLIDE.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2022-06-23.
- Yandex opensources 100B parameter GPT-like model
That's pretty much what SLIDE [0] does. The original motivation was achieving performance parity with GPUs for CPU training, but presumably the same approach could apply to running inference on models too large to fit in consumer GPU memory.
https://github.com/RUSH-LAB/SLIDE
- [R] CPU algorithm trains deep neural nets up to 15 times faster than top GPU trainers
- CPU-based algorithm trains deep neural nets up to 15 times faster than top GPU
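SLIDE's core idea, as the linked posts describe, is to use locality-sensitive hashing so that only the output neurons likely to have large activations are ever computed, skipping the rest of the layer entirely. A toy SimHash sketch of that selection step (illustrative only; the class and function names are invented here, and the real SLIDE codebase is far more sophisticated):

```python
import random

random.seed(0)

class SimHashTable:
    """Signed random projections (SimHash): vectors with a high
    dot product tend to land in the same hash bucket."""
    def __init__(self, dim, n_bits=8):
        self.planes = [[random.gauss(0, 1) for _ in range(dim)]
                       for _ in range(n_bits)]

    def hash(self, v):
        return tuple(int(sum(p * x for p, x in zip(plane, v)) > 0)
                     for plane in self.planes)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sparse_forward(x, weight_rows, table):
    """Compute only the output neurons whose weight row falls in the
    same bucket as the input x; all other outputs stay zero."""
    buckets = {}
    for i, row in enumerate(weight_rows):
        buckets.setdefault(table.hash(row), []).append(i)
    active = buckets.get(table.hash(x), [])
    out = [0.0] * len(weight_rows)
    for i in active:
        out[i] = dot(weight_rows[i], x)
    return out, active
```

In a real system the buckets would be built once per layer and refreshed periodically as weights change, which is where the claimed sub-linear cost per forward pass comes from.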
What are some alternatives?
When comparing YaLM-100B and SLIDE you can also consider the following projects:
gpt-neox - An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
YaLM-100B - Pretrained language model with 100B parameters
lc0 - The rewritten Leela Chess Zero engine, originally for TensorFlow. All other backends have since been ported here.
mesh-transformer-jax - Model parallel transformers in JAX and Haiku
goslide - SLIDE (Sub-LInear Deep learning Engine) written in Go
HashingDeepLearning - Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems"
Stockfish - A free and strong UCI chess engine