Jupyter Notebook Data Science

Open-source Jupyter Notebook projects categorized as Data Science

Top 23 Jupyter Notebook Data Science Projects

  • Made-With-ML

    Learn how to design, develop, deploy and iterate on production-grade ML applications.

    Project mention: [D] How do you keep up to date on Machine Learning? | /r/learnmachinelearning | 2023-08-13

    Made With ML

  • Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

    aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)

    Project mention: Probabilistic Programming and Bayesian Methods for Hackers (2013) | news.ycombinator.com | 2024-02-10

  • Data-Science-For-Beginners

    10 Weeks, 20 Lessons, Data Science for All!

    Project mention: Welcome to 14 days of Data Science! | dev.to | 2024-03-07

    Get started with Data Science in the Data Science for Beginners curricula.

  • fastbook

    The fastai book, published as Jupyter Notebooks

    Project mention: The fastai book, published as Jupyter Notebooks | news.ycombinator.com | 2024-01-17

  • python-machine-learning-book

    The "Python Machine Learning (1st edition)" book code repository and info resource

  • machine-learning-for-trading

    Code for Machine Learning for Algorithmic Trading, 2nd edition.

    Project mention: Machine Learning for Trading: Notebooks, resources and references accompanying the book Machine Learning for Algorithmic Trading. Courses - star count:10678.0 | /r/algoprojects | 2023-11-20

  • numerical-linear-algebra

    Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course

    Project mention: I'm a 42-years-old librarian whithout any math background and I'm willing to learn | /r/learnmachinelearning | 2023-04-27

    If you really like to dig into math, I liked the Udacity course on Intro to Deeplearning with Pytorch. Also, the Stanford course CS231n Convolutional Neural Networks for Visual Recognition is a good place to understand some basics. Other two courses to get you jumpstarted are Practical Deep Learning for Coders and Linear Algebra Course by FastAI

  • amazon-sagemaker-examples

    Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

    Project mention: Thesis Project Help Using SageMaker Free Tier | /r/aws | 2023-09-23

    I need to use AWS Sagemaker (required, can't use easier services) and my adviser gave me this document to start with: https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/question_answering_retrieval_augmented_generation/question_answering_langchain_jumpstart.ipynb

  • ML-Papers-of-the-Week

    🔥Highlighting the top ML papers every week.

    Project mention: [D] Where can I find a list of the foundational academic papers in RL/ML/DL and what are your go-to places to find new academic papers in RL/ML/DL? | /r/MachineLearning | 2023-07-07

    Labml.ai stopped working in May. I like https://github.com/dair-ai/ML-Papers-of-the-Week

  • pycaret

    An open-source, low-code machine learning library in Python

  • tsfresh

    Automatic extraction of relevant features from time series

    Project mention: For deep learning practitioners in industry, is the workflow always this annoying? [D] | /r/MachineLearning | 2023-07-10

    This is definitely a good thing to try for time-series; you can automate your feature extraction too (eg using https://github.com/blue-yonder/tsfresh ).

  • H2O

    H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

    Project mention: Really struggling with open source models | /r/LocalLLaMA | 2023-07-12

    I would use H2O if I were you. You can try out LLMs with a nice GUI. Unless you have some familiarity with the tools needed to run these projects, it can be frustrating. https://h2o.ai/

  • evidently

    Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b

    Project mention: [P] Free open-source ML observability course: starts October 16 🚀 | /r/MachineLearning | 2023-10-15

    Hi everyone, I’m one of the creators of Evidently, an open-source (Apache 2.0) tool for production ML monitoring. We’ve just launched a free open course on ML observability that I wanted to share with the community.

  • machine_learning_complete

    A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.

  • nlpaug

    Data augmentation for NLP

  • probability

    Probabilistic reasoning and statistical analysis in TensorFlow

    Project mention: How often do you see Bayesian Statistics or Stan in the DS world? Essential skill or a nice to have? | /r/datascience | 2023-06-17


  • Data-science

    Collection of useful data science topics along with articles, videos, and code (by khuyentran1401)

  • MachineLearningNotebooks

    Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft

    Project mention: Multiple model loading on a Online Fully managed endpoint | /r/AZURE | 2023-04-21

    I found an example using the Python SDK v2:

  • FLAML

    A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.

    Project mention: AutoGen: Enabling Next-Gen GPT-X Applications | news.ycombinator.com | 2023-08-22

    I really like the simplicity of this framework, and they hit on a lot of common problems found in other agent-based frameworks. Most intrigued by the RAG improvements.

    Seems like Microsoft was frustrated with the pace of movement in this space and the shitty results of agents (which admittedly kept my interest turned away from agents for the last few months). I'm interested again because it makes practical sense, and from looking at the example notebooks, seems fairly easy to integrate into existing applications.

    Maybe this is the 'low code' approach that might actually work, and bridge together engineering and non-engineering resources.

    This example was what caught my eye: https://github.com/microsoft/FLAML/blob/main/notebook/autoge...

  • python-training

    Python training for business analysts and traders

  • course-nlp

    A Code-First Introduction to NLP course

  • ML-Workspace

    🛠 All-in-one web-based IDE specialized for machine learning and data science.

  • cracking-the-data-science-interview

    A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep

    Project mention: Can someone recommend some website for data science interview preparation | /r/datascience | 2023-06-02

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-03-07.
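
The ordering rule in the note above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `rank_by_stars` is hypothetical, and the star counts are a small subset copied from the ranking table on this page rather than fetched live from GitHub.

```python
# Star counts copied from this page's ranking table (subset, deliberately unordered).
projects = {
    "fastbook": 20607,
    "Made-With-ML": 35567,
    "pycaret": 8364,
    "Data-Science-For-Beginners": 26230,
}


def rank_by_stars(stars_by_project):
    """Return (rank, name, stars) tuples, highest star count first.

    Hypothetical helper for illustration; not part of any listed project.
    """
    ordered = sorted(
        stars_by_project.items(),
        key=lambda item: item[1],  # sort on the star count
        reverse=True,              # most-starred project gets rank 1
    )
    return [(rank, name, stars) for rank, (name, stars) in enumerate(ordered, start=1)]


ranking = rank_by_stars(projects)
for rank, name, stars in ranking:
    print(f"{rank} {name} {stars:,}")
```

Because only four projects are included, the ranks printed here are relative to this subset and will not match the full table for every entry.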

What are some of the best open-source Data Science projects in Jupyter Notebook? This list will help you:

# Project Stars
1 Made-With-ML 35,567
2 Probabilistic-Programming-and-Bayesian-Methods-for-Hackers 26,321
3 Data-Science-For-Beginners 26,230
4 fastbook 20,607
5 python-machine-learning-book 12,076
6 machine-learning-for-trading 11,714
7 numerical-linear-algebra 9,988
8 amazon-sagemaker-examples 9,477
9 ML-Papers-of-the-Week 8,609
10 pycaret 8,364
11 tsfresh 8,064
12 H2O 6,705
13 evidently 4,591
14 machine_learning_complete 4,476
15 nlpaug 4,252
16 probability 4,126
17 Data-science 3,946
18 MachineLearningNotebooks 3,939
19 FLAML 3,663
20 python-training 3,418
21 course-nlp 3,390
22 ML-Workspace 3,315
23 cracking-the-data-science-interview 3,158