Meta announces a GPT-3-sized language model you can download

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • metaseq

    Repo for external large-scale work

  • "… society, and academia; and those in industry research laboratories." (partial quote from Meta's announcement describing who will be granted access to the model)

    The repository will be open "First thing in AM" (https://twitter.com/stephenroller/status/1521302841276645376):

    https://github.com/facebookresearch/metaseq/

  • gpt-2

    Code for the paper "Language Models are Unsupervised Multitask Learners"

  • Remember when OpenAI wrote this?

    > Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT-2 along with sampling code. We are not releasing the dataset, training code, or GPT-2 model weights.

    Well I guess that didn’t last long.

    https://openai.com/blog/better-language-models/

  • mesh-transformer-jax

    Model parallel transformers in JAX and Haiku

  • 175B * 16 bits = 350GB, but it does compress a bit.

    GPT-J-6B, which you can download at https://github.com/kingoflolz/mesh-transformer-jax, is 6B parameters but the download weighs only 9GB; it decompresses to 12GB, as expected. Assuming the same compression ratio, the download size would be 263GB, not 350GB.

  • gpt-2-output-dataset

    Dataset of GPT-2 outputs for research in detection, biases, and more
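
The download-size arithmetic in the GPT-J comment above can be sketched as follows (a minimal back-of-envelope check; it assumes fp16 weights at 2 bytes per parameter, and that GPT-J's observed compression ratio carries over to a 175B-parameter model):

```python
# Back-of-envelope estimate of an fp16 checkpoint's download size.
# Assumptions: 2 bytes per parameter, and the compression ratio
# observed for GPT-J-6B (9 GB download / 12 GB raw) generalizes.

def raw_size_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Uncompressed checkpoint size in GB (using 1 GB = 1e9 bytes)."""
    return params_billions * bytes_per_param

gptj_raw = raw_size_gb(6)           # 12 GB uncompressed
gptj_download = 9                   # observed GPT-J download size, GB
ratio = gptj_download / gptj_raw    # 0.75 compression ratio

opt_raw = raw_size_gb(175)          # 350 GB uncompressed
opt_download = opt_raw * ratio      # ~262.5 GB, i.e. the ~263 GB estimate

print(f"{opt_raw:.0f} GB raw -> ~{opt_download:.1f} GB download")
```

The same ratio applied to any parameter count gives a quick sanity check before starting a multi-hundred-gigabyte download.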

NOTE: The number of mentions on this list reflects mentions on shared posts plus user-suggested alternatives; a higher number therefore indicates a more popular project.


Related posts