hlb-gpt
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to larger models with one parameter change (feature currently in alpha).
It's release day again, and today we're releasing a new repository: hlb-gpt. It's based on nanoGPT, but smaller, with an aggressively trimmed feature set. In this initial release, training performs almost exactly the same as Andrej's library, but a tiny bit faster and a tiny bit more accurately, thanks to using PyTorch-native operators. We keep complexity down by targeting tiny, rapid experiments on a single GPU only. The baseline network we're releasing reaches <3.8 validation loss in just over 6 minutes. A rapidly training network offers a variety of benefits -- this helped a lot when working on hlb-cifar10. Cycle times are king in research, and we rarely need giant models to get enough of a loss signal when prototyping or experimenting with a method.
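To illustrate the "PyTorch-native operators" point: a minimal sketch comparing a hand-rolled causal attention to PyTorch's fused `scaled_dot_product_attention` (available since PyTorch 2.0). Which native operators hlb-gpt actually uses is an assumption here; this just shows why the native path can be faster, since it replaces several separate kernels with one fused call.

```python
# Sketch only -- not hlb-gpt's actual code. Shows a manual attention
# implementation vs. the equivalent PyTorch-native fused operator.
import math
import torch
import torch.nn.functional as F

B, H, T, D = 2, 4, 16, 8  # toy batch, heads, sequence length, head dim
q = torch.randn(B, H, T, D)
k = torch.randn(B, H, T, D)
v = torch.randn(B, H, T, D)

# Manual causal attention: several separate ops and a full (T, T) score matrix.
scores = (q @ k.transpose(-2, -1)) / math.sqrt(D)
causal_mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(causal_mask, float("-inf"))
manual = F.softmax(scores, dim=-1) @ v

# PyTorch-native fused operator: one call, no materialized mask needed.
native = F.scaled_dot_product_attention(q, k, v, is_causal=True)

print(torch.allclose(manual, native, atol=1e-5))  # the two paths agree
```

The fused operator also lets PyTorch pick an optimized backend (e.g. FlashAttention) where hardware supports it, which is where most of the speed difference comes from.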
You can find the code for hlb-gpt here: https://github.com/tysam-code/hlb-gpt