audiophile-e2e-pipeline vs StravaDataPipline
Metric | audiophile-e2e-pipeline | StravaDataPipline
---|---|---
Mentions | 3 | 1
Stars | 170 | 28
Growth | - | -
Activity | 0.0 | 6.0
Last commit | over 1 year ago | almost 2 years ago
Language | Python | Python
License | - | -
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
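The exact formula behind the activity number is not given here. As a rough illustration only, the sketch below shows one way a recency-weighted score could be turned into a relative 0-10 rating. Everything in it is an assumption: the exponential decay, the 30-day half-life, the percentile-based scaling, and the function names `weighted_commit_score` and `activity_rating` are all hypothetical and not taken from the site.

```python
def weighted_commit_score(commit_ages_days, half_life_days=30.0):
    """Sum one weight per commit; a commit's weight halves every `half_life_days`.

    ASSUMPTION: exponential decay and the 30-day half-life are illustrative choices.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_rating(score, all_scores):
    """Map a raw score to a 0-10 rating by percentile rank.

    ASSUMPTION: the rating is 10 times the fraction of tracked projects
    with a strictly lower raw score, so 9.0 corresponds to the top 10%.
    """
    below = sum(1 for s in all_scores if s < score)
    return round(10.0 * below / len(all_scores), 1)


# Hypothetical data: ten tracked projects with distinct raw scores 1.0 .. 10.0.
tracked = [float(i) for i in range(1, 11)]
top_rating = activity_rating(10.0, tracked)  # nine of ten projects score lower
```

Under these assumptions, a commit made today counts fully, a commit from 30 days ago counts half, and a project whose raw score beats 90% of the tracked projects receives a rating of 9.0.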
audiophile-e2e-pipeline
- Where can I find online projects end-to-end?
- Celebrating my first Data Engineering Project -- Fitbit data with PySpark, GCP, prefect, and terraform!
ris-tlp audiophile-e2e-pipeline
- Built and automated a complete end-to-end ELT pipeline using AWS, Airflow, dbt, Terraform, Metabase and more as a beginner project!
StravaDataPipline
- ELT of my own Strava data using the Strava API, MySQL, Python, S3, Redshift, and Airflow
  The GitHub repo can be found here: https://github.com/jackmleitch/StravaDataPipline
  A corresponding blog post can also be found here: https://jackmleitch.com/blog/Strava-Data-Pipeline
What are some alternatives?
data-engineering-zoomcamp - Free Data Engineering course!
Udacity-Data-Engineering-Projects - A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing, and Data Lake development.
ghcn-d - Data Pipeline from the Global Historical Climatology Network DataSet
airflow-docker - This is my Apache Airflow Local development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and workflows.
Reddit-API-Pipeline
versatile-data-kit - One framework to develop, deploy and operate data workflows with Python and SQL.
data_engineering_project_1 - My first attempt at a rough ETL pipeline; technologies include spark, GCS, prefect orchestration, and terraform
spotify-api - Pipeline that extracts data from the Spotify API to build a more detailed version of Spotify Wrapped
stream-iot - An end-to-end workflow for processing streaming data on Azure.
Skytrax-Data-Warehouse - A full data warehouse infrastructure with ETL pipelines running inside Docker, using Apache Airflow for data orchestration, AWS Redshift as the cloud data warehouse, and Metabase for data visualizations such as analytical dashboards.
streamify - A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
pydantic - Data validation using Python type hints