Top 23 scikit-learn Open-Source Projects
100 Days of ML Coding
Project mention: The Ultimate Resource Guide for Your Next 100 Days of Code | dev.to | 2021-10-25
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
Project mention: Top Github repo trends in 2021 | dev.to | 2022-01-12
Three educational courses: Web Dev, ML, and IoT for beginners. Note the use of educational resources as a marketing strategy: at least the ML course links to various Azure services. Google does this a lot as well, with Colab notebooks often used to demo educational materials.
Python Data Science Handbook: full text in Jupyter Notebooks
Project mention: Top 20 Free Machine Learning, Data Science And Python Books | dev.to | 2022-05-18
Read Here: Python Data Science Handbook eBook
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
The "Python Machine Learning (1st edition)" book code repository and info resource
Project mention: What is the purpose of meshgrid in Python / NumPy? | reddit.com/r/codehunter | 2022-01-06
I am studying "Python Machine Learning" by Sebastian Raschka, and he uses meshgrid for plotting the decision boundaries. See input 11 here.
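As a rough illustration of why meshgrid shows up in decision-boundary plots, here is a minimal sketch. It is not the book's code: `decision_grid` and `toy_classifier` are hypothetical names, and the toy rule stands in for a fitted model's `predict` method.

```python
import numpy as np


def decision_grid(x_min, x_max, y_min, y_max, step):
    """Build a regular grid of 2-D points covering the plotting area."""
    xs = np.arange(x_min, x_max, step)
    ys = np.arange(y_min, y_max, step)
    # meshgrid turns the two 1-D axes into two 2-D coordinate arrays
    xx, yy = np.meshgrid(xs, ys)
    # Flatten to an (n_points, 2) array, the shape a classifier expects
    points = np.column_stack([xx.ravel(), yy.ravel()])
    return xx, yy, points


def toy_classifier(points):
    """Hypothetical stand-in for model.predict: class 1 above the line x + y = 1."""
    return (points[:, 0] + points[:, 1] > 1.0).astype(int)


xx, yy, points = decision_grid(0.0, 2.0, 0.0, 3.0, 0.5)
# Reshape the flat predictions back onto the grid so each cell has a label
labels = toy_classifier(points).reshape(xx.shape)
```

With matplotlib available, `plt.contourf(xx, yy, labels)` would shade each predicted region, which is what makes the boundary visible.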
Parallel computing with task scheduling
Project mention: File format for large data with many columns | reddit.com/r/Python | 2022-05-15
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
Project mention: Best-Of Machine Learning with Python | news.ycombinator.com | 2022-04-28
Open Machine Learning Course
Project mention: mlcourse.ai: NEW Courses - star count:8200.0 | reddit.com/r/algoprojects | 2022-05-20
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Project mention: [Arch] pip doesn't have onnxruntime? | reddit.com/r/linuxquestions | 2022-03-06
It's not available for python 3.10.
Automated Machine Learning with scikit-learn
Project mention: Why not AutoML every tabular data? | reddit.com/r/datascience | 2021-07-26
Efficiency: Feature engineering aside, a typical data scientist workflow involves trying out different models. Some AutoML modules, such as H2O AutoML and AutoSklearn, do this for you and let you interpret your models. All of this saves a lot of time otherwise spent experimenting with standard models.
A unified framework for machine learning with time series
Project mention: Forecasting three months ahead. | reddit.com/r/datascience | 2022-04-07
Fit interpretable models. Explain blackbox machine learning.
Project mention: What Are the Most Important Statistical Ideas of the Past 50 Years? | news.ycombinator.com | 2022-02-21
You may also find Explainable Boosting Machines interesting: https://github.com/interpretml/interpret
They're a bit like a best of both worlds between linear models and random forests (generalized additive models fit with boosted decision trees)
Disclosure: I helped build this open source package
A scikit-learn compatible neural network library that wraps PyTorch
Project mention: [P] ray-skorch - distributed PyTorch on Ray with sklearn API | reddit.com/r/MachineLearning | 2022-01-04
I'm the principal author of ray-skorch, a library that lets you run distributed PyTorch training on large-scale datasets while providing a familiar, scikit-learn compatible skorch API, integrating well with the rest of the scikit-learn ecosystem.
AutoGluon: AutoML for Image, Text, and Tabular Data
Project mention: What will the data science job market be like in 5 years? | reddit.com/r/datascience | 2021-08-14
Some AutoML is getting pretty good; AutoGluon is very solid for tabular data. That said, you still need to have your data in tabular format, and deployment still requires some effort.
Visual analysis and diagnostic tools to facilitate machine learning model selection.
🍊 📊 💡 Orange: Interactive data analysis
Project mention: What software do you all use? | reddit.com/r/technicalwriting | 2022-03-27
Orange (machine learning / data mining) for reporting, text analysis. https://orangedatamining.com/
The "Python Machine Learning (3rd edition)" book code repository
Project mention: What does %-*s do in a print statement? | reddit.com/r/learnpython | 2021-12-08
from Cell 53, here
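For readers wondering about the `%-*s` specifier in that question, a small sketch of Python's printf-style formatting follows. The variable names and sample strings are illustrative assumptions, not taken from the book's notebook.

```python
# In printf-style formatting, '*' means "read the field width from the
# argument tuple", and '-' means "left-justify within that width".
name_width = 10

# Left-justified: the text is padded with spaces on the right up to name_width
left = "%-*s|%s" % (name_width, "alpha", "beta")

# Without '-', the same width pads on the left (right-justified)
right = "%*s|%s" % (name_width, "alpha", "beta")

print(left)   # alpha     |beta
print(right)  #      alpha|beta
```

Taking the width from a variable is handy when column widths are computed at runtime, for example from the longest label in a table.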
a delightful machine learning tool that allows you to train, test, and use models without writing code
Project mention: Train/fit, test, and use models without writing code | reddit.com/r/ArtificialInteligence | 2021-06-29
Link to the repo: https://github.com/nidhaloff/igel
Hummingbird compiles trained ML models into tensor computation for faster inference.
Project mention: Machine Learning with PyTorch and Scikit-Learn – The *New* Python ML Book | news.ycombinator.com | 2022-02-25
I think Rapids AI's cuML tried to go in this direction (essentially scikit-learn on the GPU): https://docs.rapids.ai/api/cuml/stable/api.html#logistic-reg.... For some reason it never really took off though.
Btw., going on a tangent, you might like Hummingbird (https://github.com/microsoft/hummingbird). It lets you convert trained scikit-learn tree-based models to PyTorch. I watched the SciPy talk last year, and it's a super smart & elegant idea.
🛠 All-in-one web-based IDE specialized for machine learning and data science.
Project mention: All-in-One Docker Based IDE for Data Science and ML | news.ycombinator.com | 2021-09-24
A library for debugging/inspecting machine learning classifiers and explaining their predictions
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
scikit-learn related posts
library / framework to test multiple sklearn regression models at once
3 projects | reddit.com/r/algotrading | 19 May 2022
Top 20 Free Machine Learning, Data Science And Python Books
2 projects | dev.to | 18 May 2022
Best courses for aspiring Data Analysts on Udemy? (No computer science background). Any recommendations?
1 project | reddit.com/r/datascience | 17 May 2022
Best resource to learn Python for Data Science?
2 projects | reddit.com/r/learnpython | 12 May 2022
Python Data Science Handbook (Free O'Reilly Book)
1 project | news.ycombinator.com | 8 May 2022
Free books for machine learning and deep learning
1 project | reddit.com/r/AI_Trends | 7 May 2022
Zama Open-Sources Concrete ML v0.2 To Support Data Scientists Without Any Prior Cryptography Knowledge To Automatically Turn Classical Machine Learning (ML) Models Into Their FHE Equivalent
1 project | reddit.com/r/artificial | 2 May 2022
What are some of the best open-source scikit-learn projects? This list will help you: