Cerebras Open-Sources Seven GPT Models and Introduces a New Scaling Law

This page summarizes the projects mentioned and recommended in the original post on /r/mlscaling

  • modelzoo

  • We believe in fostering open access to the best models, datasets, and hardware. We have therefore made the models, training recipe, weights, and checkpoints available on Hugging Face and GitHub under the permissive Apache 2.0 license (see the loading sketch after this list). Our forthcoming paper will detail our training methods and performance results, and figure 1 of the announcement summarizes how the Cerebras-GPT family compares to industry-leading models.

  • mup

    maximal update parametrization (µP)

  • This is the first time I have seen µP applied by a third party. See the Cerebras Model Zoo, where the µP models use a scale-invariant constant learning rate; a sketch of the scaling rules follows this list.
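
The practical payoff of µP is hyperparameter transfer: a learning rate tuned on a small proxy model carries over unchanged to much wider models. Below is a minimal PyTorch sketch of the idea, assuming Adam and a toy MLP; the names (`BASE_WIDTH`, `base_lr`), the widths, and the exact scaling rules are illustrative, following the general µP recipe from Yang et al. rather than Cerebras's actual training configuration.

```python
import torch
import torch.nn as nn

BASE_WIDTH = 256  # width of the small proxy model where base_lr was tuned
WIDTH = 1024      # width of the scaled-up model
base_lr = 1e-3    # tuned once on the proxy; reused verbatim at larger width

class MupMLP(nn.Module):
    """Toy MLP with muP-style initialization and output scaling."""
    def __init__(self, d_in: int, width: int, d_out: int):
        super().__init__()
        self.fc_in = nn.Linear(d_in, width)
        self.fc_hidden = nn.Linear(width, width)
        self.fc_out = nn.Linear(width, d_out)
        # Hidden weights: variance ~ 1/fan_in.
        nn.init.normal_(self.fc_hidden.weight, std=width ** -0.5)
        # Output weights: zero-init plus a multiplier that shrinks ~ 1/width.
        nn.init.zeros_(self.fc_out.weight)
        self.out_mult = BASE_WIDTH / width

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.fc_in(x))
        h = torch.relu(self.fc_hidden(h))
        return self.fc_out(h) * self.out_mult

model = MupMLP(d_in=32, width=WIDTH, d_out=10)
optimizer = torch.optim.Adam([
    # Input layer and vector-like parameters (biases) keep the base LR.
    {"params": [p for n, p in model.named_parameters()
                if n.startswith("fc_in") or n.endswith("bias")],
     "lr": base_lr},
    # Matrix-like hidden/output weights: LR scaled by base_width/width,
    # which is what makes base_lr "scale-invariant" across model sizes.
    {"params": [model.fc_hidden.weight, model.fc_out.weight],
     "lr": base_lr * BASE_WIDTH / WIDTH},
])

loss = model(torch.randn(8, 32)).sum()  # dummy step to show it runs
loss.backward()
optimizer.step()
```

Only the matrix-like weights have their per-layer step size shrunk with width; that relative rule is what lets the same constant base learning rate appear across the differently sized configs.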
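
Since the weights are published on Hugging Face under Apache 2.0, loading a released checkpoint should be a standard `transformers` call. A short sketch, assuming the checkpoints live under the `cerebras` organization on the Hub (the 111M id below is the smallest of the seven models):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id assumed from the announcement; the family spans 111M to 13B params.
model_id = "cerebras/Cerebras-GPT-111M"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Generative AI is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```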


Related posts

  • Bard is getting better at logic and reasoning

    1 project | news.ycombinator.com | 7 Jun 2023
  • OpenAI’s policies hinder reproducible research on language models

    2 projects | news.ycombinator.com | 23 Mar 2023
  • [R] Greg Yang's work on a rigorous mathematical theory for neural networks

    4 projects | /r/MachineLearning | 7 Jan 2023
  • DeepMind’s New Language Model, Chinchilla (70B Parameters), Which Outperforms GPT-3

    3 projects | news.ycombinator.com | 11 Apr 2022
  • "Training Compute-Optimal Large Language Models", Hoffmann et al 2022 {DeepMind} (current LLMs are significantly undertrained)

    1 project | /r/mlscaling | 31 Mar 2022