Show HN: 78% MNIST accuracy using GZIP in under 10 lines of code
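
The title refers to classifying digits with a general-purpose compressor. The page does not reproduce the original post's code, so what follows is only a minimal sketch of one common way to do this: nearest-neighbour search under the normalized compression distance (NCD). The names `clen`, `ncd`, and `classify` are hypothetical, and this is not claimed to be the poster's implementation.

```python
import gzip
from typing import Sequence, Tuple

def clen(data: bytes) -> int:
    """Length in bytes of the gzip-compressed input."""
    return len(gzip.compress(data))

def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance: small when a and b share structure."""
    ca, cb, cab = clen(a), clen(b), clen(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def classify(query: bytes, train: Sequence[Tuple[bytes, int]], k: int = 1) -> int:
    """Majority label among the k training samples closest to query under NCD."""
    nearest = sorted(train, key=lambda item: ncd(query, item[0]))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)
```

For MNIST, each image would be passed in as its raw pixel bytes (an assumption about the encoding). Every query compresses one concatenation per training sample, so this sketch scales poorly beyond a small training subset.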


umap_paper_notebooks

1 41 10.0 Jupyter Notebook

Notebooks in support of the UMAP paper

In fairness, you can run MNIST through UMAP and get near-perfect separation. I'm of the belief that you have to try pretty hard not to do well on MNIST these days.
https://github.com/lmcinnes/umap_paper_notebooks/blob/master...

hlb-CIFAR10

36 1,187 3.5 Python

Train CIFAR-10 in <7 seconds on an A100, the current world record.

If you'd like to play around with MNIST yourself, I wrote a PyTorch training implementation that gets ~95.45%+ accuracy in <13.6 seconds on a V100 (est. <6.5 seconds on an A100). It's made to be edited and run in Colab: https://github.com/tysam-code/hlb-CIFAR10
It's originally kitted for CIFAR10, but I've found the parameters to be quite general. The code is very easy to read and well-commented, and is a great starting place for exploration.
Min-cut deltas to run MNIST:
`.datasets.CIFAR10('` -> `.datasets.MNIST('` (both occurrences)

mono

1 5 6.7 Python

monorepo for personal projects, experiments, .. (by Jakob-98)
mnist_1_pt_2

1 16 5.3 Python

1.2% test error on MNIST using only least squares and numpy calls.

Ben Recht's kernel method implementation in 10 lines hits 98%:
https://github.com/benjamin-recht/mnist_1_pt_2/tree/main
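
To illustrate what "only least squares and numpy calls" can look like, here is a minimal sketch of a regularized least-squares classifier on one-hot targets. It is not the code from the linked repository: the function names, the `ridge` parameter, and the use of plain linear features instead of a kernel are all assumptions made for illustration.

```python
import numpy as np

def fit_least_squares(X, y, n_classes, ridge=1e-3):
    """Solve min_W ||Xb W - Y||^2 + ridge * ||W||^2 for one-hot targets Y."""
    Y = np.eye(n_classes)[y]                       # one-hot targets, shape (n, n_classes)
    Xb = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])    # regularized normal equations
    return np.linalg.solve(A, Xb.T @ Y)

def predict(W, X):
    """Pick the class whose regression output is largest."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.argmax(Xb @ W, axis=1)
```

On MNIST, `X` would hold flattened images (or kernel features computed from them) and `y` the integer digit labels. A purely linear fit like this one should not be expected to reach the error rate quoted above.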

label-errors

7 176 0.0

🛠️ Corrected Test Sets for ImageNet, MNIST, CIFAR, Caltech-256, QuickDraw, IMDB, Amazon Reviews, 20News, and AudioSet

Sadly, there are several errors in the labeled data, so no one should get 100%.
See https://labelerrors.com/


NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Train to 94% on CIFAR-10 in 3.29 seconds on a single A100
2 projects | news.ycombinator.com | 4 Apr 2024
Deep Dive into the Vision Transformers Paper (ViT)
3 projects | news.ycombinator.com | 1 Dec 2023
The Mathematics of Training LLMs
3 projects | news.ycombinator.com | 16 Aug 2023
There is no hard takeoff
2 projects | news.ycombinator.com | 11 Aug 2023
Stanford Cars (cars196) contains many Fine-Grained Errors
1 project | /r/datasets | 24 May 2023


This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Tags: Machine Learning, Benchmarking, Deep Learning, Datasets, world-record
Post date: 20 Sep 2023

