Python Machine Learning

Open-source Python projects categorized as Machine Learning

Top 23 Python Machine Learning Projects

Machine Learning
  1. transformers

    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

    Project mention: Llama 4 Smells Bad | news.ycombinator.com | 2025-04-24

    There were actually multiple bugs which impacted long context benchmarks and general inference - I helped fix some of them.

    1. The RMS norm epsilon was 1e-6, but should be 1e-5 - see https://github.com/huggingface/transformers/pull/37418

    2. Llama 4 Scout changed RoPE settings after release - conversion script for llama.cpp had to be fixed. See https://github.com/ggml-org/llama.cpp/pull/12889

    3. vLLM and the Llama 4 team found QK Norm was normalizing across entire Q & K which was wrong - accuracy increased by 2%. See https://github.com/vllm-project/vllm/pull/16311

    If you see https://x.com/WolframRvnwlf/status/1909735579564331016 - the GGUFs I uploaded for Scout actually did better than inference providers by +~5% on MMLU Pro. https://docs.unsloth.ai/basics/tutorial-how-to-run-and-fine-... has more details
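
To make points 1 and 3 of the comment concrete, here is a minimal pure-Python sketch of RMS normalization. It is not the code from any of the linked pull requests: the function names `rms_norm` and `rms_norm_per_head`, the head size, and the sample values are all assumptions chosen for illustration, and the learned weight is left out. It only shows where the epsilon enters the formula and how normalizing each head separately differs from normalizing one concatenated vector.

```python
import math

def rms_norm(x, eps=1e-5):
    """Scale x by the reciprocal of its root mean square (no learned weight)."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]

def rms_norm_per_head(x, head_dim, eps=1e-5):
    """Split x into chunks of head_dim and normalize each chunk on its own."""
    out = []
    for start in range(0, len(x), head_dim):
        out.extend(rms_norm(x[start:start + head_dim], eps))
    return out

# Hypothetical query vector: two heads of size 2 with very different magnitudes.
q = [0.01, 0.02, 3.0, 4.0]

whole = rms_norm(q)                          # one RMS over all four values
per_head = rms_norm_per_head(q, head_dim=2)  # one RMS per head

# For tiny activations, the choice of eps visibly changes the result.
tiny = [1e-3, -1e-3]
out_small_eps = rms_norm(tiny, eps=1e-6)
out_large_eps = rms_norm(tiny, eps=1e-5)
```

In this toy setup the small-magnitude head is almost erased when the whole vector shares one RMS, while per-head normalization brings both heads to a comparable scale. Likewise, the two epsilon values give noticeably different outputs once the activations are as small as the epsilon itself.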

  3. Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Project mention: How to Get Started with Scikit-Learn: A Beginner-Friendly Guide to Machine Learning in Python | dev.to | 2025-04-24

    PyTorch

  4. Keras

    Deep Learning for humans

    Project mention: A Man Out to Prove How Dumb AI Still Is | news.ycombinator.com | 2025-04-04

    >Chollet, a French computer scientist and one of the industry’s sharpest skeptics

    I feel like this description really buries the lede on Chollet's expertise. (For those who don't know, he's the creator of and lead contributor[0] to Keras)

    [0]https://github.com/keras-team/keras/graphs/contributors

  5. scikit-learn

    scikit-learn: machine learning in Python

    Project mention: 10 Useful Tools and Libraries for Python Developers | dev.to | 2025-03-29

    7. Scikit-learn - Machine Learning

  6. nn

    🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

  7. Face Recognition

    The world's simplest facial recognition api for Python and the command line

    Project mention: They See Your Photos | news.ycombinator.com | 2024-12-14

    Syncthing, python face_recognition [1], a static gallery (sigal [2]), and a few lines of bash and it's fully automatic. I can even share links with family.

    [1] https://github.com/ageitgey/face_recognition

    [2] https://github.com/saimn/sigal

  8. faceswap

    Deepfakes Software For All

  10. yolov5

    YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

    Project mention: Using YOLO for Real-Time Object Detection with Koyeb GPUs | dev.to | 2024-07-31

    There are several implementations of the YOLO algorithm available, but for ease-of-use, we will use the Ultralytics implementation in this guide. We will implement and test the code locally and then deploy to Koyeb's GPUs for higher inference speed.

  11. OpenBB

    Investment Research for Everyone, Everywhere.

    Project mention: OpenBB – Investment Research for Everyone, Everywhere | news.ycombinator.com | 2025-03-22
  12. ultralytics

    Ultralytics YOLO11 🚀

    Project mention: Show HN: Using YOLO to Detect Office Chairs in 40M Hotel Photos | news.ycombinator.com | 2025-01-25

    They did it on their own computer. https://github.com/ultralytics/ultralytics

  13. Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Project mention: Airflow AI SDK to build simple LLM workflows | news.ycombinator.com | 2025-03-26

    Hi HN,

    We've built an SDK for building DAGs / data pipelines with LLMs in Apache Airflow [1] using Pydantic AI [2] under the hood. I've seen success across the board with Airflow users building simple LLM workflows before moving on to "AI agents". In my experience, the noise around building agents means that people forget that there are other ways to get more immediate value out of LLMs.

    Coupling Airflow for orchestration and Pydantic AI for LLM interactions has turned out to be a very pragmatic approach to building these workflows (and agents). Neither tool "gets in the way" of what you're trying to do. Airflow's been around for 10+ years and has a very well-built orchestration engine rich with everything you need to write production grade data pipelines, and Pydantic AI's been a refreshing take on working with LLMs.

    Would love some feedback from this community!

    [1] https://github.com/apache/airflow
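
The idea of a DAG-based LLM workflow described in the comment can be sketched without either library. The example below is not the Airflow or Pydantic AI API; it uses only the standard-library `graphlib` module, and the task names, the `run_pipeline` helper, and the stand-in lambdas are hypothetical. It only illustrates how an orchestrator runs each step after its upstream dependencies have finished.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "fetch_feedback": set(),
    "summarize_with_llm": {"fetch_feedback"},
    "classify_sentiment": {"fetch_feedback"},
    "send_report": {"summarize_with_llm", "classify_sentiment"},
}

def run_pipeline(deps, tasks):
    """Run each task after all of its upstream tasks, passing results along."""
    results = {}
    for name in TopologicalSorter(deps).static_order():
        upstream = {d: results[d] for d in deps[name]}
        results[name] = tasks[name](upstream)
    return results

# Stand-in callables; a real workflow would call an LLM in the two middle steps.
tasks = {
    "fetch_feedback": lambda up: ["great product", "too slow"],
    "summarize_with_llm": lambda up: f"{len(up['fetch_feedback'])} comments",
    "classify_sentiment": lambda up: ["positive", "negative"],
    "send_report": lambda up: f"{up['summarize_with_llm']}: {up['classify_sentiment']}",
}

results = run_pipeline(dependencies, tasks)
order = list(TopologicalSorter(dependencies).static_order())
```

The two middle tasks depend only on `fetch_feedback`, so a real scheduler could run them in parallel; this sketch runs them one after another in a valid topological order.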

  14. streamlit

    Streamlit — A faster way to build and share data apps.

    Project mention: How AI is Transforming Front-End Development in 2025! | dev.to | 2025-04-23

    Streamlit.io: Great documentation and reusable components to integrate with your AI application for rapid python front-end AI development

  15. DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Project mention: DeepSpeed-Domino: Communication-Free LLM Training Engine | news.ycombinator.com | 2024-11-26
  16. gradio

    Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

    Project mention: How I Used Amazon Nova Reel and Gradio to Auto-Generate Stunning GIF Banners | dev.to | 2025-04-17

    To make the tool easy to use, I built a UI with Gradio:

  17. Open-Assistant

    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

  18. Ray

    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Project mention: Ask HN: What Open Source Projects Need Help? | news.ycombinator.com | 2024-11-16

    I'm guessing this comment is some kind of "if you know, you know." Likely starting from https://docs.ray.io/en/latest/cluster/vms/user-guides/launch... and then trawling through one of these I guess https://github.com/ray-project/ray/issues?q=is%3Aissue+prem+...

  19. gym

    A toolkit for developing and comparing reinforcement learning algorithms.

    Project mention: Something weird is happening with LLMs and chess | news.ycombinator.com | 2024-11-14

    > OpenAI has never done anything except conversational agents.

    Tell me you haven't been following this field without telling me you haven't been following this field[0][1][2]?

    [0]: https://github.com/openai/gym

  20. spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: 15,000 lines of verified cryptography now in Python | news.ycombinator.com | 2025-04-18

    Geez honestly

    This seems to be the issue https://github.com/explosion/spaCy/issues/13658#issuecomment...

    And you depend on opinionated libraries that break with newer versions. Why? Well because f you that's why! Because our library is not just a tool, it's a lifestyle

    Though it seems that Pydantic 1.x does support 3.13 https://docs.pydantic.dev/1.10/changelog/#v11020-2025-01-07

  21. pytorch-lightning

    Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.

    Project mention: SB-1047 will stifle open-source AI and decrease safety | news.ycombinator.com | 2024-04-29

    It's very easy to get started, right in your Terminal, no fees! No credit card at all.

    And there are cloud providers like https://replicate.com/ and https://lightning.ai/ that will let you use your LLM via an API key just like you did with OpenAI if you need that.

    You don't need OpenAI - nobody does.

  22. data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  23. MindsDB

    AI's query engine - Platform for building AI that can learn and answer questions over large scale federated data.

    Project mention: Unlocking the Power of Data with MindsDB's Federated Query Engine | dev.to | 2025-04-10

    Access open source MindsDB’s Federated Query Engine on GitHub here.

  24. paperless-ngx

    A community-supported supercharged version of paperless: scan, index and archive all your physical documents

    Project mention: Paperless-ngx: scan, index and archive all your physical documents | news.ycombinator.com | 2024-09-30
  25. supervision

    We write your reusable computer vision tools. 💜

    Project mention: Ask HN: Who is hiring? (April 2025) | news.ycombinator.com | 2025-04-01
NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Machine Learning related posts

  • How to Get Started with Scikit-Learn: A Beginner-Friendly Guide to Machine Learning in Python

    7 projects | dev.to | 24 Apr 2025
  • Create a Smart Java Chatbot Using Python’s ChatterBot – No APIs Needed

    1 project | dev.to | 24 Apr 2025
  • How I Used Amazon Nova Reel and Gradio to Auto-Generate Stunning GIF Banners

    2 projects | dev.to | 17 Apr 2025
  • Docker Model Runner

    2 projects | news.ycombinator.com | 14 Apr 2025
  • A beginner's guide to the Grounding-Dino model by Adirik on Replicate

    1 project | dev.to | 12 Apr 2025
  • This Bench Does Not Exist

    1 project | news.ycombinator.com | 11 Apr 2025
  • Show HN: Open-source, cross platform document data extraction with no OCR

    1 project | news.ycombinator.com | 11 Apr 2025

Index

What are some of the best open-source Machine Learning projects in Python? This list will help you:

 #  Project                           Stars
 1  transformers                    143,133
 2  Pytorch                          89,253
 3  Keras                            62,884
 4  scikit-learn                     61,793
 5  nn                               60,225
 6  Face Recognition                 54,636
 7  faceswap                         53,719
 8  yolov5                           53,449
 9  OpenBB                           40,929
10  ultralytics                      39,737
11  Airflow                          39,656
12  streamlit                        38,898
13  DeepSpeed                        38,004
14  gradio                           37,625
15  Open-Assistant                   37,309
16  Ray                              36,619
17  gym                              35,851
18  spaCy                            31,423
19  pytorch-lightning                29,356
20  data-science-ipython-notebooks   27,993
21  MindsDB                          27,762
22  paperless-ngx                    26,717
23  supervision                      26,491
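
The index is ordered by GitHub star count, and that ordering is easy to reproduce. The sketch below takes the star counts of the first five rows as listed in the index, deliberately shuffled, and ranks them again; the helper name `rank_by_stars` is an assumption made for illustration, not something the site publishes.

```python
# Star counts copied from the first five rows of the index, in shuffled order.
stars = {
    "scikit-learn": 61_793,
    "transformers": 143_133,
    "nn": 60_225,
    "Keras": 62_884,
    "Pytorch": 89_253,
}

def rank_by_stars(counts):
    """Return (rank, project, stars) tuples, highest star count first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(rank, name, n) for rank, (name, n) in enumerate(ordered, start=1)]

ranking = rank_by_stars(stars)
for rank, name, n in ranking:
    print(f"{rank:>2}  {name:<14} {n:,}")
```

Sorting on the count alone leaves ties in their input order, which is enough here because no two projects in the index share a star count.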

Did you know that Python is the 2nd most popular programming language based on number of references?