Top 23 Python Machine Learning Projects
-
keras
Deep Learning for humans
Then it may be a version bug; try updating to the latest TensorFlow and Keras versions. It seems to appear in this issue and hasn't been resolved, so maybe switch to PyTorch?
-
scikit-learn
scikit-learn: machine learning in Python
The return values in make_forge are from this line: X, y = make_blobs(centers=2, random_state=4, n_samples=30). The make_blobs function is imported from sklearn.datasets, and there the return values are as such:
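As a rough illustration of the call quoted above (a sketch that assumes scikit-learn is installed; it shows only what make_blobs itself returns, not anything make_forge may do with the result afterwards), the function returns a pair: a 2-D array of samples and a 1-D array of integer cluster labels.

```python
from sklearn.datasets import make_blobs

# Same call as quoted above: 30 points drawn around 2 centers, with a fixed seed.
X, y = make_blobs(centers=2, random_state=4, n_samples=30)

# X holds the samples: one row per point, one column per feature (2 by default).
print(X.shape)

# y holds the integer cluster label assigned to each point.
print(y.shape)
print(sorted(set(y.tolist())))
```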
-
face_recognition
The world's simplest facial recognition api for Python and the command line
Latest mention: How to run PoseNet model and save data points for multiple images? | reddit.com/r/MLQuestions | 2021-01-15
You feed the 17 landmark pairs as 34 inputs into a feed-forward regression network (last time I did this I just tweaked something like this). However, it can help a lot to also add in face detection to speed this up. Combining the landmarks from face detection as well, like this face recognition, and expanding the inputs to include those landmarks helps, I've found.
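The "17 pairs as 34 inputs" step can be sketched as below. This is only an illustration: the helper name flatten_landmarks and the example coordinates are hypothetical, and the regression network itself is not shown.

```python
from typing import List, Sequence, Tuple

NUM_KEYPOINTS = 17  # number of (x, y) landmark pairs mentioned in the comment


def flatten_landmarks(landmarks: Sequence[Tuple[float, float]]) -> List[float]:
    """Turn 17 (x, y) pairs into a flat list of 34 floats: [x0, y0, x1, y1, ...]."""
    if len(landmarks) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} landmarks, got {len(landmarks)}")
    flat: List[float] = []
    for x, y in landmarks:
        flat.extend((float(x), float(y)))
    return flat


# Hypothetical landmark coordinates for one image (made-up values).
example = [(i * 10.0, i * 5.0) for i in range(NUM_KEYPOINTS)]
features = flatten_landmarks(example)
print(len(features))  # 34 values, one network input per coordinate
```

One such 34-value vector per image would then form a row of the network's input matrix.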
-
faceswap
Deepfakes Software For All
Latest mention: It's alright Yagoo, Sora will forever be seiso. | reddit.com/r/Hololive | 2021-01-10
That's because deepfake has become a catch-all term for these kinds of edits. FOMM (First Order Motion Model) is what's used for these, and it is easier to use than software like DeepFaceLab or faceswap. Those require trained models for every face you want to swap, while FOMM's pre-trained models only require an image and a video.
-
gym
A toolkit for developing and comparing reinforcement learning algorithms.
Latest mention: deep reinforcement learning for non-game application | reddit.com/r/reinforcementlearning | 2021-01-24
https://github.com/openai/gym/blob/master/docs/environments.md#third-party-environments
-
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Latest mention: Learning webscraping, data analysis, and visualization, where should I start? | reddit.com/r/learnpython | 2021-01-23
data science ipython notebooks
-
spaCy
💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython
-
ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Latest mention: JAX Implementations of Actor-Critic Algorithms | reddit.com/r/reinforcementlearning | 2021-01-10
Folks like me using RLlib have observed this behavior: https://github.com/ray-project/ray/issues/12494
-
Paddle
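PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle (飞桨) core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)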
-
prophet
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
For forecasting on time-series data I can recommend https://github.com/facebook/prophet
-
gensim
Topic Modelling for Humans
Latest mention: Koan: A word2vec negative sampling implementation with correct CBOW update | news.ycombinator.com | 2021-01-02
Apparently it did: https://github.com/RaRe-Technologies/gensim/issues/1873
-
pytorch-lightning
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
Latest mention: Nicest and cleanest Deep Learning codebases out there | reddit.com/r/deeplearning | 2021-01-22
When I look at the pytorch lightning animation, the stuff on the left is easy for me to follow and the code on the right, formatted into classes, is hard. My goal is to start thinking and coding more like the code on the right. What I typically find hard about reading code where everything is inside classes, methods, functions, decorators, etc. (i.e. the code on the right) is that there will be a place that executes all these methods in a linear way, but I keep having to scroll up to the class to see what it is actually doing. On the left I can just read through the code top to bottom. I even find myself copying the code out of classes the first time I read it so it executes like the code on the left :P I feel like what I'm doing is the equivalent of typing with only my index fingers…
-
EasyOCR
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
-
nltk
NLTK Source
-
tflearn
Deep learning library featuring a higher-level API for TensorFlow.
-
awesome-aws
A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
-
nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
-
bert-as-service
Mapping a variable-length sentence to a fixed-length vector using BERT model
Latest mention: Needed 100% to pass a safety quiz, need to wait a week to retake | reddit.com/r/mildlyinfuriating | 2021-01-12
You joke but
-
fashion-mnist
A MNIST-like fashion product database.
Latest mention: [P] Why are stacked autoencoders still a thing? | reddit.com/r/MachineLearning | 2021-01-25
fashion-mnist
-
mlflow
Open source platform for the machine learning lifecycle
Latest mention: If your team does ML, what is your "MLOps" stack? | reddit.com/r/devops | 2021-01-04
MLflow seemed pretty compelling as well, though I haven't had a chance to really play with it. I hear good things from others on my team.
-
pattern
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
-
dvc
🦉Data Version Control | Git for Data & Models
Latest mention: [P] Datasets should behave like Git repositories | reddit.com/r/MachineLearning | 2021-01-19
-
nupic
Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
Index
What are some of the best open-source machine learning projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | keras | 50,548 |
2 | scikit-learn | 43,902 |
3 | face_recognition | 38,202 |
4 | faceswap | 33,988 |
5 | gym | 23,241 |
6 | data-science-ipython-notebooks | 20,109 |
7 | spaCy | 18,095 |
8 | ray | 14,527 |
9 | Paddle | 13,839 |
10 | prophet | 12,127 |
11 | gensim | 11,612 |
12 | pytorch-lightning | 11,530 |
13 | EasyOCR | 10,086 |
14 | nltk | 9,563 |
15 | tflearn | 9,505 |
16 | awesome-aws | 8,841 |
17 | nni | 8,798 |
18 | bert-as-service | 8,791 |
19 | fashion-mnist | 8,704 |
20 | mlflow | 8,240 |
21 | pattern | 7,755 |
22 | dvc | 7,086 |
23 | nupic | 6,189 |