Top 11 Jupyter Notebook data-engineering Projects
-
hamilton
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. It runs and scales everywhere Python does.
-
uber-expenses-tracking
The goal of this project is to track Uber Rides and Uber Eats expenses through data engineering processes, using technologies such as Apache Airflow, AWS Redshift, and Power BI.
-
pyspark-tutorial
PySpark Tutorial for Beginners - Practical Examples in Jupyter Notebook with Spark version 3.4.1. The tutorial covers various topics like Spark Introduction, Spark Installation, Spark RDD Transformations and Actions, Spark DataFrame, Spark SQL, and more. It is completely free on YouTube and is beginner-friendly without any prerequisites. (by coder2j)
Project mention: [D] How do you keep up to date on Machine Learning? | /r/learnmachinelearning | 2023-08-13
Made With ML
References: Data engineering zoomcamp week 6 course and homework notes: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/cohorts/2024/06-streaming
Note that this uses simple OpenAI calls; you can replace them with LangChain, LlamaIndex, Hamilton (or something else) if you prefer more abstraction, and delegate to whatever LLM you like to use. You should also probably use something a little more concrete (e.g., instructor) to guarantee output shape.
Project mention: Show HN: Hands-On Data Engineering with a Real-Estate Project Guide | news.ycombinator.com | 2024-03-20
Watch it now 👉 https://youtu.be/EB8lfdxpirM GitHub Repo 👉 https://github.com/coder2j/pyspark-tutorial
Jupyter Notebook data-engineering related posts
- Data Engineering Zoomcamp Week 6 - using redpanda 1
- Final project part 5
- Show HN: Hands-On Data Engineering with a Real-Estate Project Guide
- Using IPython Jupyter Magic commands to improve the notebook experience
- Building a project in DBT
- Testing and documenting DBT models
- Extracting data with dlt
-
A note from our sponsor - WorkOS
workos.com | 27 Apr 2024
Index
What are some of the best open-source data-engineering projects in Jupyter Notebook? This list will help you:
# | Project | Stars |
---|---|---|
1 | Made-With-ML | 35,656 |
2 | data-engineering-zoomcamp | 22,446 |
3 | hamilton | 1,312 |
4 | quilt | 1,313 |
5 | Data-Engineering-Projects | 637 |
6 | practical-data-engineering | 449 |
7 | uber-expenses-tracking | 94 |
8 | pyspark-tutorial | 28 |
9 | 60-Days-of-Data-Science-and-ML | 22 |
10 | Data-Engineering-Portfolio | 12 |
11 | data-engineering-nd | 8 |