Top 23 Jupyter Notebook Data Science Projects
-
Project mention: [D] How do you keep up to date on Machine Learning? | /r/learnmachinelearning | 2023-08-13
Made With ML
-
Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Project mention: Probabilistic Programming and Bayesian Methods for Hackers (2013) | news.ycombinator.com | 2024-02-10
-
Get started with Data Science in the Data Science for Beginners curricula.
-
Project mention: The fastai book, published as Jupyter Notebooks | news.ycombinator.com | 2024-01-17
-
python-machine-learning-book
The "Python Machine Learning (1st edition)" book code repository and info resource
-
Project mention: Machine Learning for Trading: Notebooks, resources and references accompanying the book Machine Learning for Algorithmic Trading. Courses - star count:10678.0 | /r/algoprojects | 2023-11-20
-
numerical-linear-algebra
Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course
Project mention: I'm a 42-years-old librarian whithout any math background and I'm willing to learn | /r/learnmachinelearning | 2023-04-27
If you really like to dig into math, I liked the Udacity course Intro to Deep Learning with PyTorch. The Stanford course CS231n, Convolutional Neural Networks for Visual Recognition, is also a good place to learn some basics. Two other courses to get you started are Practical Deep Learning for Coders and the Linear Algebra course by fast.ai.
-
amazon-sagemaker-examples
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
I need to use AWS Sagemaker (required, can't use easier services) and my adviser gave me this document to start with: https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/question_answering_retrieval_augmented_generation/question_answering_langchain_jumpstart.ipynb
-
Project mention: [D] Where can I find a list of the foundational academic papers in RL/ML/DL and what are your go-to places to find new academic papers in RL/ML/DL? | /r/MachineLearning | 2023-07-07
Labml.ai stopped working in May. I like https://github.com/dair-ai/ML-Papers-of-the-Week
-
Project mention: For deep learning practitioners in industry, is the workflow always this annoying? [D] | /r/MachineLearning | 2023-07-10
This is definitely a good thing to try for time-series; you can automate your feature extraction too (eg using https://github.com/blue-yonder/tsfresh ).
-
H2O
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
I would use H2O if I were you. You can try out LLMs with a nice GUI. Unless you have some familiarity with the tools needed to run these projects, it can be frustrating. https://h2o.ai/
-
evidently
Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b
Project mention: [P] Free open-source ML observability course: starts October 16 🚀 | /r/MachineLearning | 2023-10-15
Hi everyone, I’m one of the creators of Evidently, an open-source (Apache 2.0) tool for production ML monitoring. We’ve just launched a free open course on ML observability that I wanted to share with the community.
-
machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
-
Project mention: How often do you see Bayesian Statistics or Stan in the DS world? Essential skill or a nice to have? | /r/datascience | 2023-06-17
TensorFlow-Probability
-
MachineLearningNotebooks
Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
I found an example using the python SDK v2:
-
Data-science
Collection of useful data science topics along with articles, videos, and code (by khuyentran1401)
-
I really like the simplicity of this framework, and they hit on a lot of common problems found in other agent-based frameworks. Most intrigued by the RAG improvements.
Seems like Microsoft was frustrated with the pace of movement in this space and the shitty results of agents (which admittedly kept my interest turned away from agents for the last few months). I'm interested again because it makes practical sense, and from looking at the example notebooks, seems fairly easy to integrate into existing applications.
Maybe this is the 'low code' approach that might actually work, and bridge together engineering and non-engineering resources.
This example was what caught my eye: https://github.com/microsoft/FLAML/blob/main/notebook/autoge...
-
cracking-the-data-science-interview
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
Project mention: Can someone recommend some website for data science interview preparation | /r/datascience | 2023-06-02
Jupyter Notebook Data Science related posts
- Welcome to 14 days of Data Science!
- Using IPython Jupyter Magic commands to improve the notebook experience
- Bayesian Analysis with Python
- Probabilistic Programming and Bayesian Methods for Hackers (2013)
- Show HN: Logistic Regression Training on Encrypted Data with FHE
- Training ML Models on Encrypted Data with Homomorphic Encryption (FHE)
- The fastai book, published as Jupyter Notebooks
Index
What are some of the best open-source Data Science projects in Jupyter Notebook? This list will help you:
# | Project | Stars |
---|---|---|
1 | Made-With-ML | 35,567 |
2 | Probabilistic-Programming-and-Bayesian-Methods-for-Hackers | 26,321 |
3 | Data-Science-For-Beginners | 26,230 |
4 | fastbook | 20,658 |
5 | python-machine-learning-book | 12,076 |
6 | machine-learning-for-trading | 11,750 |
7 | numerical-linear-algebra | 9,993 |
8 | amazon-sagemaker-examples | 9,477 |
9 | ML-Papers-of-the-Week | 8,647 |
10 | pycaret | 8,364 |
11 | tsfresh | 8,068 |
12 | H2O | 6,705 |
13 | evidently | 4,591 |
14 | machine_learning_complete | 4,497 |
15 | nlpaug | 4,252 |
16 | probability | 4,128 |
17 | MachineLearningNotebooks | 3,949 |
18 | Data-science | 3,946 |
19 | FLAML | 3,663 |
20 | python-training | 3,418 |
21 | course-nlp | 3,390 |
22 | ML-Workspace | 3,317 |
23 | cracking-the-data-science-interview | 3,164 |