It appears the indexing for the model parts is deliberately not contiguous; the 03-82 range represents the main 80 transformer layers. https://github.com/yandex/YaLM-100B/blob/main/megatron_lm/me...
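A quick sanity check on that range (the zero-padded two-digit shard names here are an assumption based on the 03-82 description):

```python
# Shard indices 03..82 inclusive cover the 80 transformer layers;
# the non-contiguous numbering leaves room for other parts
# (e.g. embeddings) at the ends of the range.
layer_shards = [f"{i:02d}" for i in range(3, 83)]  # hypothetical naming scheme

print(len(layer_shards))                  # 80
print(layer_shards[0], layer_shards[-1])  # 03 82
```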
-
That's pretty much what SLIDE [0] does. The original motivation was achieving performance parity with GPUs for CPU training, but presumably the same approach could apply to running inference on models too large to fit in consumer GPU memory.
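For intuition, here is a heavily simplified sketch of the SLIDE idea (not the actual implementation): use random-hyperplane LSH to select a small subset of neurons whose weight vectors are likely to have a high dot product with the input, and compute only those. All names and sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_neurons, n_planes = 64, 1024, 6

W = rng.standard_normal((n_neurons, d))      # layer weight vectors
planes = rng.standard_normal((n_planes, d))  # random LSH hyperplanes

def lsh_code(v):
    # Sign pattern of v against each hyperplane -> integer bucket id.
    bits = (planes @ v > 0).astype(int)
    return int("".join(map(str, bits)), 2)

# Index every neuron into a bucket by the hash of its weight vector.
buckets = {}
for i, w in enumerate(W):
    buckets.setdefault(lsh_code(w), []).append(i)

x = rng.standard_normal(d)
active = buckets.get(lsh_code(x), [])  # neurons hashed to x's bucket
out = np.zeros(n_neurons)
out[active] = W[active] @ x            # compute only the active neurons
```

The point is that the cost per input scales with the bucket size rather than the full layer width, which is what lets a CPU keep up despite lacking the GPU's raw matmul throughput.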
-
-
It doesn't seem the code is there; only the pretrained models are. https://github.com/kingoflolz/mesh-transformer-jax/#gpt-j-6b
https://huggingface.co/EleutherAI/gpt-j-6B
Isn't that so?
-
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
-
I downloaded the weights and made a .torrent file (also a magnet link, see the raw README.md). Can somebody else who downloaded the files double-check the checksums?
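If anyone wants to verify, something like the following works; it streams each file so checkpoint shards of this size don't need to fit in RAM. The directory name is a placeholder, and SHA-256 is just a common choice, not necessarily the hash used for the published list.

```python
import hashlib
from pathlib import Path

def sha256sum(path, chunk=1 << 20):
    # Hash the file in 1 MiB chunks instead of reading it whole.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Hash every downloaded shard; compare the output against the checksum list.
for p in sorted(Path("yalm100b_checkpoint").glob("*")):  # hypothetical dir
    print(p.name, sha256sum(p))
```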
-