NanoGPT

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • nanoGPT

    The simplest, fastest repository for training/finetuning medium-sized GPTs.

  • An interesting outcome of the nanoGPT repo is the struggle to exactly match the Chinchilla findings[0], even after discussing it with the authors.

    A larger point is that the scaling laws tell you how to reach the lowest loss for a given compute budget, but the pre-training loss only measures how well the model predicts the corpus, which contains texts written by people who were wrong or whose prose was lacking. In a real system, what you want to optimize for is accuracy, composability, and inventiveness.

    [0]: https://github.com/karpathy/nanoGPT/blob/master/scaling_laws...

  • aitextgen

    A robust Python tool for text-based AI training and generation using GPT-2.

  • To train small GPT-like models, there's also aitextgen: https://github.com/minimaxir/aitextgen

  • cramming

    Cramming the training of a (BERT-type) language model into limited compute.

  • askai

    Command Line Interface for OpenAI ChatGPT (by yudax42)

  • A100s are Nvidia GPUs. You can rent them from providers like AWS or Lambda Labs. The README has instructions for downloading the original GPT-2 weights from OpenAI. You can also train a very simple version on a smaller dataset from your laptop, as described in the README.

    If you just want to play with a similar but much better model, go to https://chat.openai.com

  • whisper.cpp

    Port of OpenAI's Whisper model in C/C++

  • While doing my PhD some years ago (it wasn't a PhD on AI, but on something closely related) I trained several models with the usual stack back then (PyTorch and TF). I realized that a lot of this stack could be rewritten in much simpler terms without sacrificing much fidelity and/or performance.

    Submissions like yours and other projects like this one (https://github.com/ggerganov/whisper.cpp) make it pretty clear to me (and others) that this intuition is correct.

    There are a couple of tools I created back then that could push things further in this direction. Unfortunately they're not mature enough to warrant a release, but the ideas they embody are worth taking a look at (IMHO). If there's interest on your side (or from anyone reading this thread), I'd love to talk more about it.

  • hivemind

    Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.

  • There absolutely are! Check out hivemind (https://github.com/learning-at-home/hivemind), a general library for deep learning over the Internet, or Petals (https://petals.ml/), a system that builds on hivemind and lets you run BLOOM-176B (or other large language models) distributed over many volunteer PCs. You can join it and host some layers of the model by running literally one command on a Linux machine with Docker and a recent enough GPU.

    Disclaimer: I work on these projects, both are based on our research over the past three years
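
The scaling-law comment under nanoGPT above can be made concrete with a little arithmetic. What follows is a minimal sketch, assuming two widely quoted rules of thumb that are not stated in the thread: training compute of roughly 6 FLOPs per parameter per token, and a Chinchilla-style ratio of about 20 training tokens per parameter. The function name and its defaults are hypothetical, and the results are ballpark figures, not the fitted values from the nanoGPT notebook.

```python
def chinchilla_estimate(n_params: float, tokens_per_param: float = 20.0) -> dict:
    """Rough compute-optimal training budget for a model of n_params parameters.

    Assumptions (rules of thumb, not taken from the thread):
      * tokens ~= tokens_per_param * n_params  (about 20 tokens per parameter)
      * FLOPs  ~= 6 * n_params * tokens        (forward plus backward pass)
    """
    n_tokens = tokens_per_param * n_params
    flops = 6.0 * n_params * n_tokens
    return {"params": n_params, "tokens": n_tokens, "flops": flops}


if __name__ == "__main__":
    # 124e6 is the approximate parameter count of the smallest GPT-2 model.
    est = chinchilla_estimate(124e6)
    print(f"params: {est['params']:.3g}")
    print(f"tokens: {est['tokens']:.3g}")
    print(f"FLOPs:  {est['flops']:.3g}")
```

An estimate like this only says where the loss-optimal point roughly sits; as the comment notes, a lower pre-training loss is not the same thing as a more accurate or more useful model.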

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
