Python Machine Learning

Open-source Python projects categorized as Machine Learning

Top 23 Python Machine Learning Projects

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

    Project mention: Gemma doesn't suck anymore – 8 bug fixes | news.ycombinator.com | 2024-03-11

    Thanks! :) I'm pushing them into transformers, pytorch-gemma and collabing with the Gemma team to resolve all the issues :)

    The RoPE fix should already be in transformers 4.38.2: https://github.com/huggingface/transformers/pull/29285

    My main PR for transformers which fixes most of the issues (some still left): https://github.com/huggingface/transformers/pull/29402
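
To check whether a local install already includes the RoPE fix, one option is to compare the installed version against 4.38.2, the release named above. This is a minimal sketch using only the standard library. The helper names `release_tuple` and `has_rope_fix` are made up for illustration, and the simple parser stops at the first non-numeric component rather than following the full Python version specification.

```python
from importlib import metadata

# Release named in the comment above as containing the RoPE fix.
FIXED_IN = (4, 38, 2)


def release_tuple(version: str) -> tuple[int, ...]:
    """Turn '4.39.0.dev0' into (4, 39, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_rope_fix(version: str) -> bool:
    """Return True if the version is at least the release that carries the fix."""
    return release_tuple(version) >= FIXED_IN


try:
    installed = metadata.version("transformers")
    print(installed, has_rope_fix(installed))
except metadata.PackageNotFoundError:
    print("transformers is not installed")
```

A development build installed from source may report a higher version without containing every merged PR, so treat the result as a hint rather than proof.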

  • Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Project mention: Best AI Tools for Students Learning Development and Engineering | dev.to | 2024-03-18

    Which label applies to a tool sometimes depends on what you do with it. For example, PyTorch or TensorFlow can be called a library, a toolkit, or a machine-learning framework.

  • Keras

    Deep Learning for humans

    Project mention: Keras 3.0 | news.ycombinator.com | 2023-11-28

    All breaking changes are listed here: https://github.com/keras-team/keras/issues/18467

    You can use this migration guide to identify and fix each of these issues (and, further, to make your code run on JAX or PyTorch): https://keras.io/guides/migrating_to_keras_3/

  • scikit-learn

    scikit-learn: machine learning in Python

    Project mention: Polars | news.ycombinator.com | 2024-01-08

    sklearn is adding support through the dataframe interchange protocol (https://github.com/scikit-learn/scikit-learn/issues/25896). scipy, as far as I know, doesn't explicitly support dataframes (it just happens to work when you wrap a Series in `np.array` or `np.asarray`). I don't know about PyTorch but in general you can convert to numpy.

  • Face Recognition

    The world's simplest facial recognition api for Python and the command line

    Project mention: Security Image Recognition | /r/computervision | 2023-12-10

    Camera connected to a PI? Something like this could run locally: https://github.com/ageitgey/face_recognition

  • faceswap

    Deepfakes Software For All

    Project mention: faceswap VS facefusion - a user suggested alternative | libhunt.com/r/faceswap | 2024-01-30

  • yolov5

    YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

    Project mention: How would i go about having YOLO v5 return me a list from left to right of all detected objects in an image? | /r/computervision | 2023-11-13
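
For the question in this mention, one plausible approach is to sort the detections by the horizontal position of each bounding box. The sketch below assumes the detections have already been read out of the model into plain records. The `Detection` type, its field names, and the sample values are hypothetical and not part of the YOLOv5 API.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    """One detected object; field names are illustrative, not YOLOv5's own."""
    label: str
    confidence: float
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def left_to_right(detections: list[Detection]) -> list[Detection]:
    """Return detections ordered by the horizontal centre of each box."""
    return sorted(detections, key=lambda d: (d.xmin + d.xmax) / 2)


# Made-up sample boxes in pixel coordinates.
detections = [
    Detection("dog", 0.91, 420.0, 80.0, 600.0, 300.0),
    Detection("cat", 0.88, 30.0, 100.0, 200.0, 310.0),
    Detection("person", 0.97, 210.0, 20.0, 400.0, 330.0),
]

ordered_labels = [d.label for d in left_to_right(detections)]
print(ordered_labels)
```

Sorting on the box centre rather than `xmin` keeps wide boxes from jumping ahead of narrow ones that sit further left; either key is a reasonable choice depending on the use case.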

  • Open-Assistant

    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

    Project mention: Best open source AI chatbot alternative? | /r/opensource | 2023-12-08

    For open assistant, the code: https://github.com/LAION-AI/Open-Assistant/tree/main/inference

  • Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Project mention: Building in Public: Leveraging Tublian's AI Copilot for My Open Source Contributions | dev.to | 2024-02-12

    Contributing to Apache Airflow's open-source project immersed me in collaborative coding. Experienced maintainers rigorously reviewed my contributions, providing constructive feedback. This ongoing dialogue refined the codebase and honed my understanding of best practices.

  • gym

    A toolkit for developing and comparing reinforcement learning algorithms.

    Project mention: OpenAI Acquires Global Illumination | news.ycombinator.com | 2023-08-16

    A co-founder announced they disbanded their robots team a couple years ago: https://venturebeat.com/business/openai-disbands-its-robotic...

    That was the same time they deprecated OpenAI Gym: https://github.com/openai/gym

  • DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Project mention: Can we discuss MLOps, Deployment, Optimizations, and Speed? | /r/LocalLLaMA | 2023-12-06

    DeepSpeed can handle parallelism concerns, and even offload data/model to RAM, or even NVMe (!?). I'm surprised I don't see this project used more.
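
As a rough illustration of the offloading described above, a ZeRO stage 3 configuration can place optimizer state in CPU RAM and parameters on NVMe. The sketch builds such a config as a Python dict and serializes it to JSON. The NVMe path and the batch size are placeholder assumptions, and real settings depend on the hardware and the DeepSpeed version in use.

```python
import json

# Hypothetical mount point for a fast local NVMe drive.
NVME_PATH = "/local_nvme"

ds_config = {
    # Placeholder value; tune for the actual model and GPU memory.
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        # Parameter offload is a ZeRO stage 3 feature.
        "stage": 3,
        # Keep optimizer state in pinned CPU memory.
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        # Spill model parameters to NVMe storage.
        "offload_param": {"device": "nvme", "nvme_path": NVME_PATH},
    },
}

config_json = json.dumps(ds_config, indent=2)
print(config_json)
```

The resulting JSON would typically be saved to a file and passed to the training script as its DeepSpeed config.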

  • streamlit

    Streamlit — A faster way to build and share data apps.

    Project mention: Show HN: Buefy Web Components for Streamlit | news.ycombinator.com | 2024-03-04

    While building dashboards in Streamlit, I found myself really missing Buefy's (Bulma) modern web components.

    Especially given the inability to add new values to Streamlit's multiselect [1], plus some missing controls such as a polished image carousel [2] or a highly customizable data table.

    Long story short, we put together streamfy (Streamlit + Buefy) as an MIT licensed project in GitHub to bring Buefy to Streamlit.

    Demo: https://streamfy.streamlit.app

    All the form components are implemented; about half of the other non-form UX components are still missing. There is plenty of room for PRs, testing, feedback, documentation, examples, etc.

    Please send issues and contributions to GitHub project [3] and general feedback to X / Twitter [4]

    Thanks!

    [1] https://github.com/streamlit/streamlit/issues/5348

  • Ray

    Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Project mention: Open Source Advent Fun Wraps Up! | dev.to | 2024-01-05

    22. Ray | Github | tutorial

  • spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: Best AI SEO Tools for NLP Content Optimization | /r/aitoolsnews | 2023-12-09

    SpaCy: An open-source library providing tools for advanced NLP tasks like tokenization, entity recognition, and part-of-speech tagging.

  • gradio

    Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

    Project mention: Show HN: Dropbase – Build internal web apps with just Python | news.ycombinator.com | 2023-12-05

    There's also that library all the AI models started using that gives you a public URL to share. After researching it: https://www.gradio.app/ is the link.

    It's used specifically for making simple UIs for machine learning apps. But I guess technically you could use it for anything.

  • pytorch-lightning

    Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

    Project mention: Lightning AI Studios – A persistent GPU cloud environment | news.ycombinator.com | 2023-12-14

  • data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • OpenBBTerminal

    Investment Research for Everyone, Everywhere.

    Project mention: Open-Sourcing High-Frequency Trading and Market-Making Backtesting Tool | /r/Python | 2023-12-06

    You might want to suggest this as an extension to the OpenBB project - I imagine that could be of interest to them if there isn’t something like it built in already :-)

  • ML-From-Scratch

    Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

  • NLP-progress

    Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

  • EasyOCR

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

    Project mention: Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide | dev.to | 2023-12-27

    PyTesseract Module [GitHub], EasyOCR Module [GitHub], PaddlePaddle OCR [GitHub]

  • d2l-en

    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

    Project mention: Which book to choose for deep learning: Ian Goodfellow or Francois Chollet | /r/learnmachinelearning | 2023-04-07

  • ultralytics

    NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite

    Project mention: The CEO of Ultralytics (yolov8) using LLMs to engage with commenters on GitHub | news.ycombinator.com | 2024-02-12

    Yep, I noticed this a while ago. It posts easily identifiable ChatGPT responses. It also posts garbage wrong answers which makes it worse than useless. Totally disrespectful to the userbase.

    https://github.com/ultralytics/ultralytics/issues/5748#issue...

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-03-18.

Index

What are some of the best open-source Machine Learning projects in Python? This list will help you:

 #  Project                           Stars
 1  transformers                    122,103
 2  Pytorch                          76,684
 3  Keras                            60,643
 4  scikit-learn                     57,674
 5  Face Recognition                 51,332
 6  faceswap                         48,827
 7  yolov5                           45,808
 8  Open-Assistant                   36,472
 9  Airflow                          33,864
10  gym                              33,676
11  DeepSpeed                        31,898
12  streamlit                        30,808
13  Ray                              30,364
14  spaCy                            28,455
15  gradio                           27,486
16  pytorch-lightning                26,457
17  data-science-ipython-notebooks   26,278
18  OpenBBTerminal                   25,785
19  ML-From-Scratch                  23,004
20  NLP-progress                     22,238
21  EasyOCR                          21,448
22  d2l-en                           21,232
23  ultralytics                      20,652