Top 23 scikit-learn Open-Source Projects
-
ML: 100-Days-Of-ML-Code
-
Three educational courses: Web Dev, ML, and IoT for beginners. A note on using educational resources as a marketing strategy: at least the ML course links to various Azure services. Google does this a lot as well, with Colab notebooks often being used to demo educational materials.
-
Read Here: Python Data Science Handbook eBook
-
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
-
python-machine-learning-book
The "Python Machine Learning (1st edition)" book code repository and info resource
Project mention: What is the purpose of meshgrid in Python / NumPy? | reddit.com/r/codehunter | 2022-01-06
I am studying "Python Machine Learning" from Sebastian Raschka, and he is using it for plotting the decision borders. See input 11 here.
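To make the meshgrid question concrete, here is a minimal sketch of how `np.meshgrid` is commonly used to build a grid for a decision-boundary plot. It is not the book's code: `decision_grid` and `toy_classifier` are hypothetical names introduced for illustration, and the toy rule stands in for a fitted model's `predict` method.

```python
import numpy as np


def decision_grid(predict, x_range, y_range, step=0.5):
    """Evaluate `predict` on every point of a regular 2-D grid."""
    xs = np.arange(x_range[0], x_range[1], step)
    ys = np.arange(y_range[0], y_range[1], step)
    # meshgrid expands the two 1-D axes into two 2-D coordinate arrays,
    # so (xx[i, j], yy[i, j]) is one point of the grid.
    xx, yy = np.meshgrid(xs, ys)
    # Flatten to an (n_points, 2) feature matrix, one row per grid point.
    grid_points = np.column_stack([xx.ravel(), yy.ravel()])
    # Predict one label per point, then restore the grid shape.
    zz = predict(grid_points).reshape(xx.shape)
    return xx, yy, zz


def toy_classifier(points):
    """Hypothetical stand-in for a fitted model: class 1 where x + y > 0."""
    return (points[:, 0] + points[:, 1] > 0).astype(int)


xx, yy, zz = decision_grid(toy_classifier, (-2, 2), (-1, 1), step=0.5)
print(xx.shape, zz.shape)
```

With matplotlib installed, passing the three arrays to `plt.contourf(xx, yy, zz)` would shade the predicted regions; the grid is what lets a per-point prediction be drawn as an area.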
-
Project mention: mlcourse.ai: NEW Courses - star count:8200.0 | reddit.com/r/algoprojects | 2022-05-20
-
It's not available for Python 3.10.
-
Efficiency: Feature engineering aside, a typical data scientist's workflow involves trying out different models. AutoML modules such as H2O AutoML and auto-sklearn do this for you and let you interpret your models. All of this saves a lot of time otherwise spent experimenting with standard models.
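For context, the manual loop that AutoML tools aim to replace looks roughly like the sketch below, written with plain scikit-learn. The synthetic dataset, the two candidate models, and their settings are assumptions chosen for illustration; this is not how H2O AutoML or auto-sklearn work internally.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for a real, already-engineered feature table.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hypothetical shortlist of standard models to try by hand.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
print("best:", best_name)
```

An AutoML library would typically search over many more model families and hyperparameters than this hand-written shortlist.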
-
Project mention: What Are the Most Important Statistical Ideas of the Past 50 Years? | news.ycombinator.com | 2022-02-21
You may also find Explainable Boosting Machines interesting: https://github.com/interpretml/interpret
They're a bit like a best of both worlds between linear models and random forests (generalized additive models fit with boosted decision trees)
Disclosure: I helped build this open source package
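The phrase "generalized additive models fit with boosted decision trees" can be illustrated with a small sketch. This is not the interpret package's implementation or API, which is more involved; it only shows the core idea under simplifying assumptions: each feature gets its own sum of shallow trees, fitted round-robin on the current residual. The function names, `LEARNING_RATE`, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

LEARNING_RATE = 0.1  # illustrative value, not a library default


def fit_additive_model(X, y, n_rounds=30, max_depth=2):
    """Boost shallow single-feature trees, cycling through the features."""
    intercept = float(np.mean(y))
    residual = y - intercept
    trees_per_feature = [[] for _ in range(X.shape[1])]
    for _ in range(n_rounds):
        for j in range(X.shape[1]):
            column = X[:, [j]]  # each tree sees exactly one feature
            tree = DecisionTreeRegressor(max_depth=max_depth).fit(column, residual)
            residual = residual - LEARNING_RATE * tree.predict(column)
            trees_per_feature[j].append(tree)
    return intercept, trees_per_feature


def feature_effect(trees, values):
    """Learned contribution of one feature, evaluated at `values`."""
    column = np.asarray(values, dtype=float).reshape(-1, 1)
    return LEARNING_RATE * sum(tree.predict(column) for tree in trees)


def predict(intercept, trees_per_feature, X):
    """Prediction is the intercept plus the sum of per-feature effects."""
    effects = [feature_effect(trees, X[:, j]) for j, trees in enumerate(trees_per_feature)]
    return intercept + np.sum(effects, axis=0)


rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]  # additive ground truth

intercept, trees_per_feature = fit_additive_model(X, y)
mse = float(np.mean((predict(intercept, trees_per_feature, X) - y) ** 2))
baseline_mse = float(np.var(y))  # error of always predicting the mean
print(f"training MSE {mse:.4f} vs. baseline {baseline_mse:.4f}")
```

Because every tree depends on a single feature, `feature_effect` can be plotted per feature, which is what gives this model family its linear-model-like interpretability while still capturing nonlinear shapes.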
-
Project mention: [P] ray-skorch - distributed PyTorch on Ray with sklearn API | reddit.com/r/MachineLearning | 2022-01-04
I'm the principal author of ray-skorch, a library that lets you run distributed PyTorch training on large-scale datasets while providing a familiar, scikit-learn compatible skorch API, integrating well with the rest of the scikit-learn ecosystem.
-
Project mention: What will the data science job market be like in 5 years? | reddit.com/r/datascience | 2021-08-14
Some AutoML is getting pretty good; AutoGluon is very solid for tabular data. That being said, you still need to have your data in tabular format, and deployment still requires some effort.
-
Orange (machine learning / data mining) for reporting, text analysis. https://orangedatamining.com/
-
python-machine-learning-book-3rd-edition
The "Python Machine Learning (3rd edition)" book code repository
from Cell 53, here
-
igel
a delightful machine learning tool that allows you to train, test, and use models without writing code
Project mention: Train/fit, test, and use models without writing code | reddit.com/r/ArtificialInteligence | 2021-06-29
Link to the repo: https://github.com/nidhaloff/igel
-
Project mention: Machine Learning with PyTorch and Scikit-Learn – The *New* Python ML Book | news.ycombinator.com | 2022-02-25
I think Rapids AI's cuML tried to go in this direction (essentially scikit-learn on the GPU): https://docs.rapids.ai/api/cuml/stable/api.html#logistic-reg.... For some reason it never really took off, though.
Btw., going on a tangent, you might like Hummingbird (https://github.com/microsoft/hummingbird). It allows you to convert trained scikit-learn tree-based models to PyTorch. I watched the SciPy talk last year, and it's a super smart & elegant idea.
-
Project mention: All-in-One Docker Based IDE for Data Science and ML | news.ycombinator.com | 2021-09-24
-
eli5
A library for debugging/inspecting machine learning classifiers and explaining their predictions
-
mars
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
-
m2cgen
Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
scikit-learn related posts
- library / framework to test multiple sklearn regression models at once
- Top 20 Free Machine Learning, Data Science And Python Books
- Best courses for aspring Data Analysts on Udemy? (No computer science background). Any recommendations?
- Best resource to learn Python for Data Science?
- Python Data Science Handbook (Free O'Reilly Book)
- Free books for machine learning and deep learning
- Zama Open-Sources Concrete ML v0.2 To Support Data Scientists Without Any Prior Cryptography Knowledge To Automatically Turn Classical Machine Learning (ML) Models Into Their FHE Equivalent
Index
What are some of the best open-source scikit-learn projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | 100-Days-Of-ML-Code | 37,261 |
2 | ML-For-Beginners | 36,114 |
3 | PythonDataScienceHandbook | 34,564 |
4 | data-science-ipython-notebooks | 23,080 |
5 | python-machine-learning-book | 11,568 |
6 | Dask | 9,890 |
7 | best-of-ml-python | 9,563 |
8 | mlcourse.ai | 8,199 |
9 | onnxruntime | 6,721 |
10 | auto-sklearn | 6,275 |
11 | sktime | 5,321 |
12 | interpret | 4,733 |
13 | skorch | 4,493 |
14 | autogluon | 4,471 |
15 | yellowbrick | 3,606 |
16 | orange | 3,408 |
17 | python-machine-learning-book-3rd-edition | 3,183 |
18 | igel | 2,980 |
19 | hummingbird | 2,812 |
20 | ML-Workspace | 2,545 |
21 | eli5 | 2,536 |
22 | mars | 2,412 |
23 | m2cgen | 2,094 |