Python Machine Learning

Open-source Python projects categorized as Machine Learning

Top 23 Python Machine Learning Projects

  • GitHub repo Keras

    Deep Learning for humans

    Project mention: 5 ways to keep your skills fresh after finishing a coding bootcamp | dev.to | 2021-11-28

    One way to improve your projects and coding skills is to try new models and libraries. For example, if you did classification with logistic regression, also try a random forest; if you used TensorFlow, now try Keras; if you scraped a website with BeautifulSoup, now do it with Scrapy. You get the point.

  • GitHub repo scikit-learn

    scikit-learn: machine learning in Python

    Project mention: Data Science toolset summary from 2021 | dev.to | 2021-11-13

    Scikit-learn - It is one of the most widely used frameworks for Python-based data science tasks. It features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Link - https://scikit-learn.org/

  • GitHub repo Face Recognition

    The world's simplest facial recognition api for Python and the command line

    Project mention: Facial recognition | reddit.com/r/OSINT | 2021-10-18

  • GitHub repo faceswap

    Deepfakes Software For All

    Project mention: Use the infamous Deep Fakes project for things other than faces | reddit.com/r/learnmachinelearning | 2021-10-25

    My current challenge is getting those masked wheel images to be able to swap between images, or to apply a new wheel on a car image. To get a decent result that doesn't look fake, it would have to do some minor warping and resizing. To me, this seems like exactly what the Deep Fakes repo does. https://github.com/deepfakes/faceswap

  • GitHub repo gym

    A toolkit for developing and comparing reinforcement learning algorithms.

    Project mention: Ask HN: What would a reality show for Software Engineers look like? | news.ycombinator.com | 2021-11-29

    Very well put. You got me thinking about what I do as a software developer that might be considered entertaining. Maybe computer security events like capture the flag (https://ctftime.org/), or coding an AI agent in a simulated environment to achieve a goal (https://www.codingame.com, https://gym.openai.com/), or simply competing with others to solve an algorithmic problem under either a time constraint or a code-length constraint.

    Without appealing visuals none of it will be interesting to people who don’t have a development background. Maybe generative art is the answer but IMO it is more about art than programming.

  • GitHub repo data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

    Project mention: Beginner in Python for Data Science | reddit.com/r/learnpython | 2020-12-27

    data science ipython notebooks

  • GitHub repo spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: Two Methods to Scan for PII in Data Warehouses | dev.to | 2021-11-29

    NLP libraries such as Stanford NER Detector and spaCy

  • GitHub repo NLP-progress

    Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

    Project mention: Upcoming App Announcement: Lemmatize, a Foreign Language Reader | reddit.com/r/languagelearning | 2021-11-11

    A standard step in Chinese text processing is word segmentation, which deals with this problem.

  • GitHub repo Ray

    An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

    Project mention: JORLDY: OpenSource Reinforcement Learning Framework | reddit.com/r/reinforcementlearning | 2021-11-08

    Distributed RL algorithms are provided using Ray

  • GitHub repo PaddlePaddle

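    PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle (飞桨) core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)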

    Project mention: I have issue with only __habs for half datatype? Please help! | reddit.com/r/CUDA | 2021-06-15

  • GitHub repo streamlit

    Streamlit — The fastest way to build data apps in Python

    Project mention: Suggestions for GUI framework for an app to browse tables of data, with buttons and dropdown menus in cells? And some related PySimpleGui questions | reddit.com/r/learnpython | 2021-11-07

    I've never used it, but someone suggested it in another thread, and it looked interesting to me, so I have it bookmarked to try out: https://streamlit.io/

  • GitHub repo pytorch-lightning

    The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

    Project mention: [D] Colab TPU low performance | reddit.com/r/MachineLearning | 2021-11-18

    I wanted to make a quick performance comparison between the GPU (Tesla K80) and TPU (v2-8) available in Google Colab with PyTorch. To do so quickly, I used an MNIST example from pytorch-lightning that trains a simple CNN.

  • GitHub repo Prophet

    Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

    Project mention: prophet: NEW Data - star count:13699.0 | reddit.com/r/algoprojects | 2021-11-26

  • GitHub repo EasyOCR

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.

    Project mention: [Question] Best approach for Optical Character recognition on large (20MB+) photos? | reddit.com/r/opencv | 2021-11-10

    Try easyocr or Tesseract. Both are pretty easy to use and don't need much background in OpenCV.

  • GitHub repo rasa

    💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

    Project mention: How to Create the Perfect README for Your Open Source Project | dev.to | 2021-11-02

    This example is sourced from RasaHQ

  • GitHub repo gensim

    Topic Modelling for Humans

    Project mention: Gensim – a Python library for topic modelling, document indexing | news.ycombinator.com | 2021-11-25

  • GitHub repo jina

    Cloud-native neural search framework for 𝙖𝙣𝙮 kind of data

    Project mention: Open source tools to track github repository stats? | reddit.com/r/opensource | 2021-10-24

    I use this tool everyday to track growth for Jina (an open-source neural search framework)

  • GitHub repo imgaug

    Image augmentation for machine learning experiments.

    Project mention: [N] Facebook AI Open Sources AugLy: A New Python Library For Data Augmentation To Develop Robust Machine Learning Models | reddit.com/r/MachineLearning | 2021-06-19

    https://github.com/aleju/imgaug This one is way better for image.

  • GitHub repo horovod

    Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

    Project mention: [D] GPU buying recommendation | reddit.com/r/MachineLearning | 2021-07-17

    If you just want to run TensorFlow or PyTorch in a Jupyter notebook, setting up the environment shouldn't be difficult. I know that AWS has a marketplace of preconfigured images. However, you can go as advanced as building a cluster of GPU-equipped nodes and setting up Horovod (https://github.com/horovod/horovod) to do distributed machine learning. Yes, there's a learning curve, but you cannot acquire this skill set any other way.

  • GitHub repo tensor2tensor

    Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

    Project mention: [D] Resources for Understanding The Original Transformer Paper | reddit.com/r/MachineLearning | 2021-09-08

    Code for https://arxiv.org/abs/1706.03762 found: https://github.com/tensorflow/tensor2tensor

  • GitHub repo ChatterBot

    ChatterBot is a machine learning, conversational dialog engine for creating chat bots

    Project mention: Just getting into Chatbot development in Python and having trouble choosing an API/Wrapper .. or Framework? | reddit.com/r/learnpython | 2021-10-30

    chatterbot

  • GitHub repo recommenders

    Best Practices on Recommendation Systems

    Project mention: Opinion on choice of model - Recommender System | reddit.com/r/datascience | 2021-04-10

    Then I tried to find some more advanced models, and I found this really good list, and in there I found the Microsoft one. So that's where we are now, with a bunch of different models and not much documentation or many tutorials out there.

  • GitHub repo d2l-en

    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 300 universities from 55 countries including Stanford, MIT, Harvard, and Cambridge.

    Project mention: I created a way to learn machine learning through Jupyter | reddit.com/r/learnmachinelearning | 2021-04-30

    There are actually some online books and courses built on Jupyter Notebook ([Dive into Deep Learning](https://github.com/d2l-ai/d2l-en), for example). However, yours is more detailed and could really help beginners.
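
The NLP-progress mention above calls word segmentation a standard step in Chinese text processing. As a rough illustration only, here is a minimal sketch of one classic approach, greedy forward maximum matching, in plain Python. The `segment` function, the `max_word_len` parameter, and the `TOY_VOCABULARY` set are hypothetical names made up for this example; they are not taken from NLP-progress or any library, and real segmenters rely on much larger dictionaries or statistical models.

```python
def segment(text, vocabulary, max_word_len=4):
    """Split text into words with greedy forward maximum matching.

    At each position, take the longest substring (up to max_word_len
    characters) found in the vocabulary; if nothing matches, emit a
    single character so the scan always advances.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy dictionary, assumed purely for illustration.
TOY_VOCABULARY = {"我", "喜欢", "机器", "学习", "机器学习"}

print(segment("我喜欢机器学习", TOY_VOCABULARY))
```

Because the matcher is greedy, it prefers the four-character entry over the two shorter words it contains, which is the usual trade-off of this heuristic: fast and simple, but unable to resolve ambiguous splits.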

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-11-29.

Index

What are some of the best open-source Machine Learning projects in Python? This list will help you:

Rank  Project                          Stars
1     Keras                           53,329
2     scikit-learn                    48,081
3     Face Recognition                42,325
4     faceswap                        39,700
5     gym                             25,808
6     data-science-ipython-notebooks  21,875
7     spaCy                           21,827
8     NLP-progress                    19,419
9     Ray                             18,270
10    PaddlePaddle                    17,089
11    streamlit                       16,661
12    pytorch-lightning               16,326
13    Prophet                         13,740
14    EasyOCR                         13,149
15    rasa                            13,108
16    gensim                          12,694
17    jina                            12,306
18    imgaug                          12,020
19    horovod                         11,881
20    tensor2tensor                   11,790
21    ChatterBot                      11,745
22    recommenders                    11,705
23    d2l-en                          11,569