Top 23 scikit-learn Open-Source Projects
-
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
-
python-machine-learning-book
The "Python Machine Learning (1st edition)" book code repository and info resource
-
machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
-
python-machine-learning-book-3rd-edition
The "Python Machine Learning (3rd edition)" book code repository
-
superduperdb
🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.
- https://github.com/microsoft/ML-For-Beginners
Also check out this list Pitt puts out every year:
Project mention: Top 10 GitHub Repositories Every Developer Should Bookmark in 2024 | dev.to | 2024-02-07
2) 100 Days of ML Code: Embark on a 100-day journey into the fascinating world of machine learning with this structured curriculum. Packed with bite-sized coding challenges and real-world projects, this repository will transform you from a coding novice to a confident ML enthusiast. (https://github.com/Avik-Jain/100-Days-Of-ML-Code)
Project mention: About Data analyst, data scientist and data engineer, resources and experiences | dev.to | 2024-03-26
Python Data Science Handbook
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Project mention: pip install remyxai - easiest way to create custom vision models | /r/computervision | 2023-04-25
This seems not very convincing. There are other popular frameworks that provide AutoML with existing datasets (e.g. https://github.com/autogluon/autogluon)
Project mention: Featuretools – A Python Library for Automated Feature Engineering | news.ycombinator.com | 2023-09-20
I know I've tooted its horn before, but Orange3 is a pretty neat Python-based GUI platform that makes this and a metric buttload of other statistical/ML techniques available to non-programmer types.
Just watch out for the null character `\x00` in the corpus. That always seems to kill it stone dead.
https://orangedatamining.com/
https://orange3.readthedocs.io/projects/orange-visual-progra...
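The null-character warning above can be handled before the text ever reaches a GUI tool. What follows is a minimal sketch, assuming the corpus is available as plain Python strings. The helper names `strip_nul` and `clean_corpus` are hypothetical and are not part of Orange3's API.

```python
def strip_nul(text: str) -> str:
    # Remove every NUL character (code point 0) from a single document.
    return text.replace("\x00", "")


def clean_corpus(documents):
    # Apply strip_nul to each document in an iterable of strings.
    return [strip_nul(doc) for doc in documents]


# Hypothetical sample corpus with embedded NUL characters.
corpus = ["clean text", "bad\x00text", "\x00"]
cleaned = clean_corpus(corpus)
```

Running the cleaning step once at load time keeps the original files untouched while giving downstream tools NUL-free text.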
Project mention: Pyenv – lets you easily switch between multiple versions of Python | news.ycombinator.com | 2024-03-25
We use Pyenv successfully for developing the Flower open-source project. We use a few simple Bash scripts to manage virtual environments with different Python versions via pyenv and the pyenv-virtualenv plugin.
The main scripts are `venv-create.sh`, `venv-delete.sh` and `bootstrap.sh`. `venv-reset.sh` pulls these three scripts together to make reinstalling your venv a single command.
Here's the link if anyone is interested: https://github.com/adap/flower/tree/main/dev
I really like the simplicity of this framework, and they hit on a lot of common problems found in other agent-based frameworks. Most intrigued by the RAG improvements.
Seems like Microsoft was frustrated with the pace of movement in this space and the shitty results of agents (which admittedly kept my interest turned away from agents for the last few months). I'm interested again because it makes practical sense, and from looking at the example notebooks, seems fairly easy to integrate into existing applications.
Maybe this is the 'low code' approach that might actually work, and bridge together engineering and non-engineering resources.
This example was what caught my eye: https://github.com/microsoft/FLAML/blob/main/notebook/autoge...
scikit-learn related posts
- About Data analyst, data scientist and data engineer, resources and experiences
- Show HN: Logistic Regression Training on Encrypted Data with FHE
- Implementing a ChatGPT-like LLM from scratch, step by step
- Training ML Models on Encrypted Data with Homomorphic Encryption (FHE)
- AlphaPy: machine learning framework built on sklearn and pandas. Support pyfolio/xgboost/lightgbm/catboost (gradient boosting on decision trees) etc. Examples include financial market prediction/sports prediction/kaggle. Configurations are set though
- Tradero: A tool for achieving self-funding via trading
- Scikit-learn Stock Prediction: using fundamental and pricing data to predict future stock returns. Sklearn's random forest classifier is trained and the author claimed positive live trading results. Not actively maintained. Other Models - star count: 1520.0
-
A note from our sponsor - SaaSHub
www.saashub.com | 23 Apr 2024
Index
What are some of the best open-source scikit-learn projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | ML-For-Beginners | 66,806 |
2 | 100-Days-Of-ML-Code | 43,200 |
3 | PythonDataScienceHandbook | 41,407 |
4 | data-science-ipython-notebooks | 26,459 |
5 | handson-ml | 25,090 |
6 | best-of-ml-python | 15,302 |
7 | onnxruntime | 12,656 |
8 | python-machine-learning-book | 12,076 |
9 | Dask | 11,982 |
10 | mlcourse.ai | 9,390 |
11 | auto-sklearn | 7,394 |
12 | sktime | 7,387 |
13 | autogluon | 7,091 |
14 | featuretools | 7,017 |
15 | interpret | 5,988 |
16 | skorch | 5,614 |
17 | orange | 4,604 |
18 | machine_learning_complete | 4,501 |
19 | python-machine-learning-book-3rd-edition | 4,386 |
20 | superduperdb | 4,327 |
21 | yellowbrick | 4,194 |
22 | flower | 4,166 |
23 | FLAML | 3,671 |