Top 23 Machine Learning Open-Source Projects
-
Project mention: The $100 ChatGPT: Why Karpathy's nanochat Represents the Next Big Thing | dev.to | 2026-05-04
TensorFlow: 2.1 million lines
-
transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Project mention: The $100 ChatGPT: Why Karpathy's nanochat Represents the Next Big Thing | dev.to | 2026-05-04
Hugging Face Transformers: 500,000+ lines
-
Project mention: Tracing torch.cuda.empty_cache() on an RTX 4090 - Where Do the 53 MB Go? | dev.to | 2026-05-28
pytorch/pytorch#173382 - a user calls torch.cuda.empty_cache() after deleting tensors, but GPU memory stays allocated. The caching allocator's empty_cache() only releases blocks it has marked as free, but the user sees a persistent gap between "allocated" and "reserved" memory. We traced what happens when torch cuda empty cache runs on an RTX 4090 and measured exactly how much GPU memory it reclaims.
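The gap between "allocated" and "reserved" can be illustrated with a toy model. The sketch below is not PyTorch's caching allocator; `ToyCachingAllocator`, its fixed-size segments, and the slot names are all hypothetical, and the snippet above does not say that fragmentation is the cause in the linked issue. It only shows one way a cache that releases free memory in whole segments can leave reserved memory above allocated memory after an `empty_cache()` call.

```python
class ToyCachingAllocator:
    """Toy model (an assumption, not PyTorch's design): memory is reserved
    from the driver in fixed-size segments and handed out slot by slot."""

    def __init__(self, segment_size=4):
        self.segment_size = segment_size
        self.segments = []  # each segment is a list of slots: a name or None

    def alloc(self, name):
        # Reuse a free slot in an already reserved segment if one exists.
        for seg in self.segments:
            for i, slot in enumerate(seg):
                if slot is None:
                    seg[i] = name
                    return
        # Otherwise reserve a new segment from the "driver".
        seg = [None] * self.segment_size
        seg[0] = name
        self.segments.append(seg)

    def free(self, name):
        # Freed slots return to the cache, not to the driver.
        for seg in self.segments:
            for i, slot in enumerate(seg):
                if slot == name:
                    seg[i] = None
                    return
        raise KeyError(name)

    def empty_cache(self):
        # Only segments with no live slots can be handed back.
        self.segments = [s for s in self.segments
                         if any(slot is not None for slot in s)]

    @property
    def allocated(self):
        return sum(slot is not None for s in self.segments for slot in s)

    @property
    def reserved(self):
        return len(self.segments) * self.segment_size


pool = ToyCachingAllocator(segment_size=4)
for i in range(8):
    pool.alloc(f"t{i}")            # fills two segments
for i in range(1, 8):
    pool.free(f"t{i}")             # keep only t0 alive

before = (pool.allocated, pool.reserved)
pool.empty_cache()
after = (pool.allocated, pool.reserved)
print(before, after)
```

In this toy run the second segment is released, but the first stays reserved because `t0` still lives in it, so reserved memory remains above allocated memory. On a real GPU the comparable numbers come from `torch.cuda.memory_allocated()` and `torch.cuda.memory_reserved()`.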
-
-
A beginner-friendly ML curriculum with practical examples and exercises you can actually finish. A solid starting point if you’re new to ML and want quick wins. Link: https://github.com/microsoft/ML-For-Beginners
-
-
A hands-on, end-to-end course on building, evaluating, and deploying LLM applications. Ideal when you want a clear path from the spark of an idea to deployment. Link: https://github.com/mlabonne/llm-course
-
> Like… has anyone done a Jepsen-like stress test on rsyslogd and shared the results? I’ve half-assedly looked before and not been able to find anything.
I've not used rsyslogd specifically, but I don't see how you'd have any issues with the log volume you described.
[1] https://github.com/netdata/netdata/tree/master/src/crates/ne...
[2] https://learn.netdata.cloud/docs/logs/systemd-journal-logs/s...
-
How does it compare to Tesseract? https://github.com/tesseract-ocr/tesseract
I use ocrmypdf (which uses Tesseract). Runs locally and is absolutely fantastic. https://ocrmypdf.readthedocs.io/en/latest/
-
A deeply-synthesized, opinionated reference distilled from five canonical sources: donnemartin/system-design-primer · ByteByteGoHq/system-design-101 · karanpratapsingh/system-design · ashishps1/awesome-system-design-resources · binhnguyennus/awesome-scalability
-
How would you rate OpenBB [0]? It’s touted as a Bloomberg Terminal alternative and it has most certainly been included in training for all SOTA models.
0. https://github.com/OpenBB-finance/OpenBB
-
nn
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
-
certutil.exe or notepad.exe opening an external connection lands in rare because, fleet-wide, those processes almost never egress. Tune the <= 3 threshold to your environment size. For a more principled version, score each (process, destination) pair by frequency and treat the long tail as the hunt queue, which is the same idea behind scikit-learn's rarity-based anomaly methods without the model overhead.
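A minimal sketch of the frequency-scoring idea, assuming telemetry has already been reduced to (process, destination) pairs. The sample events, the addresses, and the `rare_pairs` helper are hypothetical; only the `<= 3` threshold comes from the text above.

```python
from collections import Counter

# Hypothetical fleet-wide telemetry: one (process, destination) pair per connection.
events = (
    [("chrome.exe", "cdn.example.com")] * 50
    + [("svchost.exe", "update.example.com")] * 20
    + [("notepad.exe", "198.51.100.9")] * 2
    + [("certutil.exe", "203.0.113.7")] * 1
)


def rare_pairs(events, threshold=3):
    """Return (pair, count) for pairs seen at most `threshold` times, rarest first."""
    counts = Counter(events)
    rare = [(pair, n) for pair, n in counts.items() if n <= threshold]
    # Sort by count, then by pair, so the queue order is deterministic.
    return sorted(rare, key=lambda item: (item[1], item[0]))


hunt_queue = rare_pairs(events)
for (process, destination), count in hunt_queue:
    print(f"{count:>3}  {process} -> {destination}")
```

Raising `threshold` widens the hunt queue. A larger fleet generally needs a higher value, because even unusual pairs occur more often in absolute terms.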
-
Keras 3 multi-backend
-
llm-app
Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
-
Project mention: Transforming Unstructured Retail Catalogs into Structured Data using AI | dev.to | 2026-04-10
Before reading any text, we run the raw catalog pages through a custom object detection model (based on YOLO architecture). This model is trained to identify the bounding boxes of individual product regions, allowing us to crop the giant page into smaller, isolated product images.
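The cropping step can be sketched without any detection library. This assumes the detector has already returned pixel-coordinate boxes as `(x1, y1, x2, y2)` tuples; the `crop_regions` helper, the box format, and the tiny list-of-rows "page" are hypothetical stand-ins, not the article's code.

```python
def crop_regions(page, boxes):
    """Crop each (x1, y1, x2, y2) box out of a page stored as rows of pixels."""
    height = len(page)
    width = len(page[0]) if height else 0
    crops = []
    for x1, y1, x2, y2 in boxes:
        # Clamp to the page bounds so slightly oversized detections still crop.
        x1, x2 = max(0, x1), min(width, x2)
        y1, y2 = max(0, y1), min(height, y2)
        if x1 < x2 and y1 < y2:
            crops.append([row[x1:x2] for row in page[y1:y2]])
    return crops


# Hypothetical 4x6 "page" whose pixel value encodes its row and column.
page = [[10 * r + c for c in range(6)] for r in range(4)]
# Hypothetical detector output; the last box spills past the page edge.
boxes = [(0, 0, 2, 2), (3, 1, 6, 3), (5, 3, 9, 9)]

crops = crop_regions(page, boxes)
for crop in crops:
    print(crop)
```

With a real image library the same slicing applies to the image array, and each crop would then go on to the text-reading stage.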
-
Project mention: Teaching AI to Read Emotions: Science, Challenges, and Innovation Behind Facial Emotion Detection with YOLOv11 on Raspberry Pi | dev.to | 2025-11-23
Ultralytics YOLO Documentation
-
Project mention: Show HN: Real-time privacy protection for smart glasses | news.ycombinator.com | 2025-08-11
Did you look at EgoBlur? It's a lot more effective at face detection than https://github.com/ageitgey/face_recognition. Granted, you'd have to do your own face matching to handle exceptions.
-
-
-
If you're looking for a language that aims to solve the "two-language problem" like Mojo, but want something more open, more mature and less influenced by VC funding, check out Julia: https://julialang.org/
-
-
Machine Learning discussion
Machine Learning related posts
-
Show HN: PyTorch on Java
-
Playing with Vision Embeddings
-
The Smallest Brain You Can Build: A Perceptron in Python
-
TensorCircuit-NG vs cuQuantum on H200: JIT compilation beats the "magic GPU library" assumption
-
Why JAX Is a Much Better Backend for Quantum Circuit Simulation Than PyTorch
-
TensorCircuit-NG: How to Tell Whether a Quantum x AI x HPC Platform Is Truly Mature When Everyone Tells the Same Story
-
Pluto.jl 1.0 release – reactive notebook for Julia
-
A note from our sponsor - SaaSHub
www.saashub.com | 13 Jun 2026
Index
What are some of the best open-source Machine Learning projects? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | tensorflow | 195,632 |
| 2 | transformers | 161,558 |
| 3 | PyTorch | 100,659 |
| 4 | LLMs-from-scratch | 96,979 |
| 5 | ML-For-Beginners | 86,783 |
| 6 | cs-video-courses | 81,757 |
| 7 | llm-course | 80,074 |
| 8 | Netdata | 79,077 |
| 9 | tesseract-ocr | 74,650 |
| 10 | awesome-scalability | 71,672 |
| 11 | OpenBB | 69,044 |
| 12 | nn | 66,926 |
| 13 | scikit-learn | 66,289 |
| 14 | Keras | 64,086 |
| 15 | llm-app | 59,332 |
| 16 | ultralytics | 58,307 |
| 17 | yolov5 | 57,519 |
| 18 | Face Recognition | 56,402 |
| 19 | faceswap | 55,268 |
| 20 | 100-Days-Of-ML-Code | 50,067 |
| 21 | julia | 48,823 |
| 22 | Made-With-ML | 47,992 |
| 23 | AI-For-Beginners | 47,979 |