Do Simpler Machine Learning Models Exist and How Can We Find Them?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • hlb-CIFAR10

    Train CIFAR-10 in <7 seconds on an A100, the current world record.

    I recently released a codebase in beta that modernizes a tiny model that gets really good performance on CIFAR-10 in about 18.1 seconds on the right single GPU. A number of years ago the world record was 10 minutes, down from several days a few years previously.

    While most of my work was porting and cleaning up certain parts of the code for a different purpose (a just-clone-and-hack experimentation workbench), I've spent years optimizing neural networks at a very fine-grained level, and many of the lessons learned while debugging here reflected that.

    There are fundamentally a few NP-hard layers, unfortunately, but they are not hard blockers to progress. The model I mentioned above is extremely simple and has little "extra fat" where it is not needed. Importantly, it also seems to have good flow of gradients (and the like) throughout, which matters for a model to be able to learn quickly. There are a few reasonable priors, like initializing and freezing the first convolution to whiten the inputs based upon some statistics from the training data. That does a shocking amount of work in stabilizing and speeding up training.

    Ultimately, the network is simple, and there are a number of other methods to help it reach near-SOTA, but they are as simple as can be. I think as this project evolves and we get nearer to the goal (<2 seconds in a year or two), we'll keep uncovering good puzzle pieces showing exactly what it is that's allowing such a tiny network to perform so well. There's a kind of exponential value to having ultra-short training times -- you can somewhat open-endedly barrage-test your algorithm, something that's already led to a few interesting discoveries that I'd like to refine before publishing to the repo.

    If you're interested, the code is here. The running code is a single .py with the upsides and downsides that come with that. If you're interested or have any questions, let me know! :D :))))

    https://github.com/tysam-code/hlb-CIFAR10
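
The whitening prior mentioned in the comment above (initializing and freezing the first convolution from training-data statistics) could be sketched roughly as follows. This is not taken from the hlb-CIFAR10 code: the function names, the 2x2 patch size, and the eps value are assumptions made for illustration, and NumPy stands in for a deep-learning framework.

```python
import numpy as np


def extract_patches(images, patch):
    """Flatten every patch x patch window of (N, C, H, W) images into a row."""
    c = images.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(
        images, (patch, patch), axis=(2, 3)
    )  # shape: (N, C, H-patch+1, W-patch+1, patch, patch)
    # Move channels next to the window axes so each row is ordered (C, patch, patch).
    return windows.transpose(0, 2, 3, 1, 4, 5).reshape(-1, c * patch * patch)


def whitening_filters(images, patch=2, eps=1e-2):
    """Return conv filters of shape (C*patch*patch, C, patch, patch).

    Each filter projects a patch onto one eigenvector of the patch
    covariance and rescales it to roughly unit variance (PCA whitening).
    eps is an assumed regularizer that avoids dividing by tiny eigenvalues.
    """
    c = images.shape[1]
    patches = extract_patches(images, patch)
    patches = patches - patches.mean(axis=0)
    cov = patches.T @ patches / (patches.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvectors are the columns
    weights = (eigvecs / np.sqrt(eigvals + eps)).T  # one filter per row
    return weights.reshape(-1, c, patch, patch)
```

In a training script, the returned array would be copied into the first convolution's weights and that layer excluded from gradient updates; how the repository actually does this is not shown here.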

  • symreg

    A Symbolic Regression engine

    If interpretability is sufficiently important, you could straight-up search for mathematical formulae.

    My SymReg library pops to mind. I'm thinking of rewriting it in multithreaded Julia this holiday season.

    https://github.com/danuker/symreg
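
To make "search for mathematical formulae" concrete, here is a minimal sketch of symbolic regression by plain random search over small expression trees. It does not use the symreg API; every name in it (`random_expr`, `evaluate`, `search`, the operator set, the constants) is a hypothetical choice for illustration, and real engines use far better search strategies.

```python
import math
import operator
import random

# Assumed operator set; a real engine would offer many more.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def random_expr(rng, depth=2):
    """Build a random expression: 'x', a constant, or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", 1.0, 2.0, 3.0])
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))


def evaluate(expr, x):
    """Evaluate an expression tree at the point x."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if expr == "x" else expr


def to_string(expr):
    """Render the tree as a readable formula."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({to_string(left)} {op} {to_string(right)})"
    return str(expr)


def mse(expr, xs, ys):
    """Mean squared error of the formula on the data."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def search(xs, ys, trials=20000, seed=0):
    """Keep the best of `trials` randomly generated formulae."""
    rng = random.Random(seed)
    best, best_err = None, math.inf
    for _ in range(trials):
        candidate = random_expr(rng)
        err = mse(candidate, xs, ys)
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err


# Toy data generated from y = x*x + 1.
XS = [float(x) for x in range(-4, 5)]
YS = [x * x + 1 for x in XS]
```

Calling `search(XS, YS)` and printing `to_string(best)` gives a human-readable formula, which is the interpretability payoff the comment is pointing at.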

  • SymbolicRegression.jl

    Distributed High-Performance Symbolic Regression in Julia

  • AI

    Artificial Intelligence Projects (by RowColz)

    > similar to genetic programming

    A genetic algorithm was also what I was thinking of. Come up with some kind of symbolic (textual) way to represent a wiring/circuit diagram (graph) and evolve the most efficient "learner" using mutation and cross-breeding (e-sex). The earliest GA I read about used Lisp dicing.

    As far as "easiest" AI for humans to work with, "Factor tables" may be a way:

    https://github.com/RowColz/AI

    Tuning the AI then becomes more like accounting than a lab with Doc Brown.
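
The mutation and cross-breeding loop described in the comment above might look like the following minimal sketch. The textual genome here is just a string of 0/1 characters, imagined as "which wires exist" in a small circuit, and the fitness function is a hypothetical stand-in that counts matches against a made-up target wiring; a real setup would score how well the evolved learner performs. None of this comes from the RowColz/AI repository.

```python
import random


def evolve(fitness, genome_len, pop_size=40, generations=300,
           mutation_rate=0.02, seed=0):
    """Evolve a 0/1 string that maximizes `fitness` (higher is better)."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(genome_len))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Cross-breeding: one-point crossover of two parents.
            cut = rng.randrange(1, genome_len)
            child = a[:cut] + b[cut:]
            # Mutation: flip each character with a small probability.
            child = "".join(
                ("1" if ch == "0" else "0") if rng.random() < mutation_rate else ch
                for ch in child
            )
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


# Hypothetical target wiring, used only to give the toy a score.
TARGET = "1011001110001101"


def wiring_fitness(genome):
    """Count positions where the genome matches the assumed target."""
    return sum(g == t for g, t in zip(genome, TARGET))
```

A call such as `evolve(wiring_fitness, len(TARGET))` returns the fittest string found. Because the surviving half is carried over unchanged, the best score never decreases from one generation to the next.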
