Cerebras Open-Sources Seven GPT Models and Introduces a New Scaling Law

This page summarizes the projects mentioned and recommended in the original post on /r/mlscaling

  • modelzoo

  • We believe in fostering open access to the best models, datasets, and hardware. We have therefore made the models, training recipe, weights, and checkpoints available on Hugging Face and GitHub under the permissive Apache 2.0 license (see the loading sketch after this list). Our forthcoming paper will detail our training methods and performance results, and figure 1 of the announcement summarizes how the Cerebras-GPT family compares to industry-leading models.

  • mup

    maximal update parametrization (µP)

  • This is the first time I have seen µP applied by a third party. See the Cerebras Model Zoo, where the µP models use a scale-invariant constant learning rate; a sketch of the scaling rules follows this list.
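
The practical payoff of µP is hyperparameter transfer: a learning rate tuned on a small proxy model carries over unchanged to much wider models. Below is a minimal PyTorch sketch of the idea, assuming Adam and a toy MLP; the names (`BASE_WIDTH`, `base_lr`), the widths, and the exact scaling rules are illustrative, following the general µP recipe from Yang et al. rather than Cerebras's actual training configuration.

```python
import torch
import torch.nn as nn

BASE_WIDTH = 256  # width of the small proxy model where base_lr was tuned
WIDTH = 1024      # width of the scaled-up model
base_lr = 1e-3    # tuned once on the proxy; reused verbatim at larger width

class MupMLP(nn.Module):
    """Toy MLP with muP-style initialization and output scaling."""
    def __init__(self, d_in: int, width: int, d_out: int):
        super().__init__()
        self.fc_in = nn.Linear(d_in, width)
        self.fc_hidden = nn.Linear(width, width)
        self.fc_out = nn.Linear(width, d_out)
        # Hidden weights: variance ~ 1/fan_in.
        nn.init.normal_(self.fc_hidden.weight, std=width ** -0.5)
        # Output weights: zero-init plus a multiplier that shrinks ~ 1/width.
        nn.init.zeros_(self.fc_out.weight)
        self.out_mult = BASE_WIDTH / width

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.fc_in(x))
        h = torch.relu(self.fc_hidden(h))
        return self.fc_out(h) * self.out_mult

model = MupMLP(d_in=32, width=WIDTH, d_out=10)
optimizer = torch.optim.Adam([
    # Input layer and vector-like parameters (biases) keep the base LR.
    {"params": [p for n, p in model.named_parameters()
                if n.startswith("fc_in") or n.endswith("bias")],
     "lr": base_lr},
    # Matrix-like hidden/output weights: LR scaled by base_width/width,
    # which is what makes base_lr "scale-invariant" across model sizes.
    {"params": [model.fc_hidden.weight, model.fc_out.weight],
     "lr": base_lr * BASE_WIDTH / WIDTH},
])

loss = model(torch.randn(8, 32)).sum()  # dummy step to show it runs
loss.backward()
optimizer.step()
```

Only the matrix-like weights have their per-layer step size shrunk with width; that relative rule is what lets the same constant base learning rate appear across the differently sized configs.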
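
Since the weights are published on Hugging Face under Apache 2.0, loading a released checkpoint should be a standard `transformers` call. A short sketch, assuming the checkpoints live under the `cerebras` organization on the Hub (the 111M id below is the smallest of the seven models):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id assumed from the announcement; the family spans 111M to 13B params.
model_id = "cerebras/Cerebras-GPT-111M"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Generative AI is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```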


Related posts

  • Bard is getting better at logic and reasoning

    1 project | news.ycombinator.com | 7 Jun 2023
  • OpenAI’s policies hinder reproducible research on language models

    2 projects | news.ycombinator.com | 23 Mar 2023
  • [R] Greg Yang's work on a rigorous mathematical theory for neural networks

    4 projects | /r/MachineLearning | 7 Jan 2023
  • DeepMind’s New Language Model, Chinchilla (70B Parameters), Which Outperforms GPT-3

    3 projects | news.ycombinator.com | 11 Apr 2022
  • "Training Compute-Optimal Large Language Models", Hoffmann et al 2022 {DeepMind} (current LLMs are significantly undertrained)

    1 project | /r/mlscaling | 31 Mar 2022