Python Machine Learning

Open-source Python projects categorized as Machine Learning

Top 23 Python Machine Learning Projects

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

    Project mention: HuggingFace Transformers: Qwen2 | | 2024-01-11
  • Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Project mention: PyTorch 2.2: FlashAttention-v2, AOTInductor | | 2024-01-30
  • Keras

    Deep Learning for humans

    Project mention: Keras 3.0 | | 2023-11-28

    All breaking changes are listed here:

    You can use this migration guide to identify and fix each of these issues (and further, making your code run on JAX or PyTorch):

  • scikit-learn

    scikit-learn: machine learning in Python

    Project mention: Polars | | 2024-01-08

    sklearn is adding support through the dataframe interchange protocol. scipy, as far as I know, doesn't explicitly support dataframes (it just happens to work when you wrap a Series in `np.array` or `np.asarray`). I don't know about PyTorch, but in general you can convert to numpy.

  • Face Recognition

    The world's simplest facial recognition api for Python and the command line

    Project mention: Security Image Recognition | /r/computervision | 2023-12-10

    Camera connected to a PI? Something like this could run locally:

  • faceswap

    Deepfakes Software For All

    Project mention: faceswap VS facefusion - a user suggested alternative | | 2024-01-30
  • yolov5

    YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

    Project mention: How would i go about having YOLO v5 return me a list from left to right of all detected objects in an image? | /r/computervision | 2023-11-13
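
One possible way to approach the question in that post title is to sort the detections by the horizontal position of their bounding boxes. The sketch below is a minimal illustration, not the yolov5 API: the `Detection` type and the `sort_left_to_right` helper are hypothetical names, the sample boxes are made up, and it assumes each detection has already been extracted from the model output as (xmin, ymin, xmax, ymax, confidence, label).

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """One detected object. The field layout is an assumption for this sketch."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    confidence: float
    label: str


def sort_left_to_right(detections: List[Detection]) -> List[Detection]:
    """Order detections by the horizontal centre of each bounding box."""
    return sorted(detections, key=lambda d: (d.xmin + d.xmax) / 2)


# Made-up sample boxes, not real model output.
sample = [
    Detection(300.0, 40.0, 380.0, 200.0, 0.91, "dog"),
    Detection(20.0, 50.0, 120.0, 210.0, 0.88, "person"),
    Detection(150.0, 60.0, 260.0, 190.0, 0.75, "bicycle"),
]

labels_in_order = [d.label for d in sort_left_to_right(sample)]
print(labels_in_order)
```

Sorting on the box centre rather than on `xmin` alone is a design choice: it tends to behave more intuitively when a wide box starts slightly to the left of a narrow one.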

  • Open-Assistant

    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

    Project mention: Best open source AI chatbot alternative? | /r/opensource | 2023-12-08

    For open assistant, the code:

  • gym

    A toolkit for developing and comparing reinforcement learning algorithms.

    Project mention: OpenAI Acquires Global Illumination | | 2023-08-16

    A co-founder announced they disbanded their robots team a couple years ago:

    That was the same time they deprecated OpenAI Gym:

  • Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Project mention: Building in Public: Leveraging Tublian's AI Copilot for My Open Source Contributions | | 2024-02-12

    Contributing to Apache Airflow's open-source project immersed me in collaborative coding. Experienced maintainers rigorously reviewed my contributions, providing constructive feedback. This ongoing dialogue refined the codebase and honed my understanding of best practices.

  • DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Project mention: Can we discuss MLOps, Deployment, Optimizations, and Speed? | /r/LocalLLaMA | 2023-12-06

    DeepSpeed can handle parallelism concerns, and even offload data/model to RAM, or even NVMe (!?) . I'm surprised I don't see this project used more.

  • streamlit

    Streamlit — A faster way to build and share data apps.

    Project mention: Show HN: Hyperdiv – Reactive, immediate-mode web UI framework for Python | | 2024-02-20

    Looks cool. How do you see this differing from streamlit?

  • Ray

    Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Project mention: Open Source Advent Fun Wraps Up! | | 2024-01-05

    22. Ray | Github | tutorial

  • spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: Best AI SEO Tools for NLP Content Optimization | /r/aitoolsnews | 2023-12-09

    SpaCy: An open-source library providing tools for advanced NLP tasks like tokenization, entity recognition, and part-of-speech tagging.

  • gradio

    Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

    Project mention: Show HN: Dropbase – Build internal web apps with just Python | | 2023-12-05

    There's also that library all the AI models started using that gives you a public URL to share. After researching it: is the link.

    It's used specifically for making simple UIs for machine learning apps. But I guess technically you could use it for anything.

  • pytorch-lightning

    Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

    Project mention: Lightning AI Studios – A persistent GPU cloud environment | | 2023-12-14
  • data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • OpenBBTerminal

    Investment Research for Everyone, Everywhere.

    Project mention: Open-Sourcing High-Frequency Trading and Market-Making Backtesting Tool | /r/Python | 2023-12-06

    You might want to suggest this as an extension to the OpenBB project - I imagine that could be of interest to them if there isn’t something like it built in already :-)

  • ML-From-Scratch

    Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

  • NLP-progress

    Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

  • EasyOCR

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

    Project mention: Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide | | 2023-12-27

    PyTesseract Module [ Github ] EasyOCR Module [ Github ] PaddlePaddle OCR [ Github ]

  • d2l-en

    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

    Project mention: which book to choose for deep learning: Ian Goodfellow or Francois Chollet | /r/learnmachinelearning | 2023-04-07
  • MindsDB

    The middleware for building custom AI, enabling smarter organizations.

    Project mention: Fine-tuning a Mistral Language Model with Anyscale | | 2024-02-01

    MindsDB is an open-source AI platform for developers that connects AI/ML models with real-time data. It provides tools and automation to easily build and maintain personalized AI solutions.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-02-20.

What are some of the best open-source Machine Learning projects in Python? This list will help you:

Rank Project Stars
1 transformers 120,169
2 Pytorch 75,519
3 Keras 60,309
4 scikit-learn 57,413
5 Face Recognition 51,132
6 faceswap 48,520
7 yolov5 45,090
8 Open-Assistant 36,309
9 gym 33,548
10 Airflow 33,510
11 DeepSpeed 31,395
12 streamlit 30,216
13 Ray 29,919
14 spaCy 28,280
15 gradio 26,422
16 pytorch-lightning 26,262
17 data-science-ipython-notebooks 26,140
18 OpenBBTerminal 25,590
19 ML-From-Scratch 22,922
20 NLP-progress 22,167
21 EasyOCR 21,106
22 d2l-en 20,882
23 MindsDB 19,969