Open-source projects categorized as scikit-learn

Top 23 scikit-learn Open-Source Projects

  • 100-Days-Of-ML-Code

    100 Days of ML Coding

    Project mention: The Ultimate Resource Guide for Your Next 100 Days of Code | | 2021-10-25

    ML: 100-Days-Of-ML-Code

  • ML-For-Beginners

    12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

    Project mention: Top Github repo trends in 2021 | | 2022-01-12

    Three educational courses: Web Dev, ML, and IoT for beginners. Note the use of educational resources as a marketing strategy: at least the ML course links to various Azure services. Google does this a lot as well, with Colab notebooks often used to demo educational materials.

  • PythonDataScienceHandbook

    Python Data Science Handbook: full text in Jupyter Notebooks

    Project mention: Top 20 Free Machine Learning, Data Science And Python Books | | 2022-05-18

    Read Here: Python Data Science Handbook eBook

  • data-science-ipython-notebooks

    Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • python-machine-learning-book

    The "Python Machine Learning (1st edition)" book code repository and info resource

    Project mention: What is the purpose of meshgrid in Python / NumPy? | | 2022-01-06

    I am studying "Python Machine Learning" from Sebastian Raschka, and he is using it for plotting the decision borders. See input 11 here.

  • Dask

    Parallel computing with task scheduling

    Project mention: File format for large data with many columns | | 2022-05-15
  • best-of-ml-python

    🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

    Project mention: Best-Of Machine Learning with Python | | 2022-04-28

    Open Machine Learning Course

    Project mention: NEW Courses - star count:8200.0 | | 2022-05-20
  • onnxruntime

    ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

    Project mention: [Arch] pip doesn't have onnxruntime? | | 2022-03-06

    It's not available for python 3.10.

  • auto-sklearn

    Automated Machine Learning with scikit-learn

    Project mention: Why not AutoML every tabular data? | | 2021-07-26

    Efficiency: Feature engineering aside, a typical data scientist workflow involves trying out different models. Some AutoML modules, like H2O AutoML and AutoSklearn, do this for you and allow you to interpret your models. All of these save a lot of time otherwise spent experimenting with standard models.

  • sktime

    A unified framework for machine learning with time series

    Project mention: Forecasting three months ahead. | | 2022-04-07
  • interpret

    Fit interpretable models. Explain blackbox machine learning.

    Project mention: What Are the Most Important Statistical Ideas of the Past 50 Years? | | 2022-02-21

    You may also find Explainable Boosting Machines interesting:

    They're a bit like a best of both worlds between linear models and random forests (generalized additive models fit with boosted decision trees)

    Disclosure: I helped build this open source package

  • skorch

    A scikit-learn compatible neural network library that wraps PyTorch

    Project mention: [P] ray-skorch - distributed PyTorch on Ray with sklearn API | | 2022-01-04

    I'm the principal author of ray-skorch, a library that lets you run distributed PyTorch training on large-scale datasets while providing a familiar, scikit-learn compatible skorch API, integrating well with the rest of the scikit-learn ecosystem.

  • autogluon

    AutoGluon: AutoML for Image, Text, and Tabular Data

    Project mention: What will the data science job market be like in 5 years? | | 2021-08-14

    Some AutoML is getting pretty good, AutoGluon is very solid for tabular data. That being said you still need to have your data in tabular format and deployment still requires some effort.

  • yellowbrick

    Visual analysis and diagnostic tools to facilitate machine learning model selection.

  • orange

    🍊 📊 💡 Orange: Interactive data analysis

    Project mention: What software do you all use? | | 2022-03-27

    Orange (machine learning / data mining) for reporting, text analysis.

  • python-machine-learning-book-3rd-edition

    The "Python Machine Learning (3rd edition)" book code repository

    Project mention: What does %-*s do in a print statement? | | 2021-12-08

    from Cell 53, here

  • igel

    a delightful machine learning tool that allows you to train, test, and use models without writing code

    Project mention: Train/fit, test, and use models without writing code | | 2021-06-29

    Link to the repo:

  • hummingbird

    Hummingbird compiles trained ML models into tensor computation for faster inference.

    Project mention: Machine Learning with PyTorch and Scikit-Learn – The *New* Python ML Book | | 2022-02-25

    I think Rapids AI's cuML tried to go in this direction (essentially scikit-learn on the GPU). For some reason it never really took off, though.

    Btw., going on a tangent, you might like Hummingbird. It lets you convert trained scikit-learn tree-based models to PyTorch. I watched the SciPy talk last year, and it's a super smart & elegant idea.

  • ML-Workspace

    🛠 All-in-one web-based IDE specialized for machine learning and data science.

    Project mention: All-in-One Docker Based IDE for Data Science and ML | | 2021-09-24
  • eli5

    A library for debugging/inspecting machine learning classifiers and explaining their predictions

  • mars

    Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.

  • m2cgen

    Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
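
The meshgrid question cited under python-machine-learning-book above concerns plotting decision boundaries. As a minimal sketch of that idea, assuming NumPy is installed: `np.meshgrid` expands two 1-D axes into a 2-D grid of coordinates, every grid point is classified, and the labels are reshaped back to the grid. The `predict` function here is a hypothetical stand-in for a trained classifier, not code from the book.

```python
import numpy as np


def predict(points):
    """Hypothetical stand-in classifier: label 1 where x + y > 1, else 0."""
    return (points[:, 0] + points[:, 1] > 1.0).astype(int)


# Two 1-D axes spanning the region to visualize.
xs = np.linspace(0.0, 2.0, 5)
ys = np.linspace(0.0, 1.0, 3)

# meshgrid expands the axes into 2-D coordinate arrays of shape (len(ys), len(xs)).
xx, yy = np.meshgrid(xs, ys)

# Flatten the grid into an (n_points, 2) array, classify each point,
# then reshape the labels back to the grid shape.
grid_points = np.column_stack([xx.ravel(), yy.ravel()])
labels = predict(grid_points).reshape(xx.shape)
```

With matplotlib available, `plt.contourf(xx, yy, labels)` would shade the two predicted regions; a real model's `predict` method could take the place of the stand-in.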

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-05-20.
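
The `%-*s` question cited under python-machine-learning-book-3rd-edition above is about Python's printf-style formatting. As a small illustration (the helper name `left_justify` is made up for this sketch): `*` takes the field width from the argument tuple, and `-` left-justifies the value within that width.

```python
def left_justify(value, width):
    """Left-justify `value` in a field of `width` characters using %-*s."""
    # '*' reads the field width from the tuple; '-' pads on the right.
    return "%-*s" % (width, value)


# "abc" padded to 8 characters, followed by a visible end marker.
row = left_justify("abc", 8) + "|"

# The width is a minimum: longer values are not truncated.
long_value = left_justify("abcdefghij", 4)

# For strings, the f-string spelling gives the same padding.
same = f"{'abc':<{8}}" + "|"
```

The width-from-argument form is handy when column widths are computed at run time, for example from the longest label in a table.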

What are some of the best open-source scikit-learn projects? This list will help you:

 #  Project                                    Stars
 1  100-Days-Of-ML-Code                       37,261
 2  ML-For-Beginners                          36,114
 3  PythonDataScienceHandbook                 34,564
 4  data-science-ipython-notebooks            23,080
 5  python-machine-learning-book              11,568
 6  Dask                                       9,890
 7  best-of-ml-python                          9,563
 8                                             8,199
 9  onnxruntime                                6,721
10  auto-sklearn                               6,275
11  sktime                                     5,321
12  interpret                                  4,733
13  skorch                                     4,493
14  autogluon                                  4,471
15  yellowbrick                                3,606
16  orange                                     3,408
17  python-machine-learning-book-3rd-edition   3,183
18  igel                                       2,980
19  hummingbird                                2,812
20  ML-Workspace                               2,545
21  eli5                                       2,536
22  mars                                       2,412
23  m2cgen                                     2,094