An interesting outcome of the nanoGPT repo is the struggle to exactly reproduce the Chinchilla findings[0], even after discussing it with the authors.
A larger point is that the scaling laws give you the compute-optimal loss, but pre-training loss only measures how well the model predicts its corpus, which contains text written by people who were wrong or whose prose was lacking. In a real system, what you actually want to optimize for is accuracy, composability, and inventiveness.
[0]: https://github.com/karpathy/nanoGPT/blob/master/scaling_laws...
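For intuition, here's a back-of-the-envelope sketch of the Chinchilla rule of thumb (compute C ≈ 6·N·D FLOPs, with roughly 20 training tokens per parameter at the optimum). The constants are coarse approximations from the paper, not the exact fitted law:

```python
# Back-of-the-envelope Chinchilla rule of thumb (Hoffmann et al. 2022):
# training compute C ~ 6 * N * D FLOPs, and the compute-optimal point
# has roughly 20 training tokens per parameter. Approximations only.

def chinchilla_optimal(compute_flops: float):
    """Given a FLOP budget C, return (params N, tokens D) with D ~ 20 N."""
    # C = 6 * N * D and D = 20 * N  =>  N = sqrt(C / 120)
    n_params = (compute_flops / 120) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens

# e.g. a budget of 1e21 FLOPs (a modest research-scale run)
n, d = chinchilla_optimal(1e21)
print(f"params ~ {n:.3g}, tokens ~ {d:.3g}")  # ~2.9e9 params, ~5.8e10 tokens
```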
To train small GPT-like models, there's also aitextgen: https://github.com/minimaxir/aitextgen
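Basic usage looks roughly like this per its README (the API may have shifted between versions):

```python
from aitextgen import aitextgen

# With no arguments this loads the default 124M GPT-2;
# pass model="..." to use a different checkpoint.
ai = aitextgen()
ai.generate(n=1, prompt="The secret to training small models is", max_length=64)
```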
A100s are Nvidia GPUs. You can rent them from providers like AWS or Lambda Labs. The README has instructions for downloading the original GPT-2 weights from OpenAI, and you can also train a very simple version on a smaller dataset from your laptop.
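As a minimal sketch, assuming you're running from the nanoGPT repo root (model.py provides a from_pretrained that fetches the weights via HuggingFace transformers):

```python
import torch
from model import GPT  # nanoGPT's model.py

# Downloads and converts the original 124M-parameter GPT-2 checkpoint.
model = GPT.from_pretrained('gpt2')
model.eval()

idx = torch.zeros((1, 1), dtype=torch.long)  # dummy start token
out = model.generate(idx, max_new_tokens=20)  # see sample.py for proper decoding
```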
If you just want to play with a similar but much better model, go to https://chat.openai.com
While doing my PhD some years ago (it wasn't a PhD in AI, but a closely related field), I trained several models with the usual stack back then (PyTorch and TensorFlow). I realized that a lot of this stack could be rewritten in much simpler terms without sacrificing much fidelity or performance.
Submissions like yours, and other projects like whisper.cpp -> https://github.com/ggerganov/whisper.cpp
make it pretty clear to me (and others) that this intuition is correct.
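To make that concrete: the core op of these models, causal self-attention, fits in a dozen lines of plain NumPy. This is an illustrative sketch, not code from any particular library:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """x: (seq, d_model); wq/wk/wv: (d_model, d_head)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])           # (seq, seq)
    mask = np.triu(np.ones_like(scores), k=1) * -1e9  # hide future tokens
    return softmax(scores + mask) @ v                 # (seq, d_head)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
w = [rng.normal(size=(16, 4)) for _ in range(3)]
print(self_attention(x, *w).shape)  # (8, 4)
```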
There are a couple of tools I created back then that could push things further in this direction; unfortunately they're not mature enough to warrant a release, but the ideas they embody are worth a look (IMHO). If there's interest on your side (or from anyone reading this thread), I'd love to talk more about it.
There absolutely are! Check out hivemind (https://github.com/learning-at-home/hivemind), a general library for deep learning over the Internet, or Petals (https://petals.ml/), a system built on hivemind that lets you run BLOOM-176B (or other large language models) distributed over many volunteer PCs. You can join it and host some layers of the model by running literally one command on a Linux machine with Docker and a recent enough GPU.
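Inference then looks roughly like this, following the Petals README at the time of writing (class and model names are the project's, but may have changed since):

```python
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"
tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
# The model's layers are served by volunteer peers over the Internet;
# only embeddings and a few local layers live on your machine.
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))
```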
Disclaimer: I work on these projects; both are based on our research over the past three years.