weather_data_pipeline
dataproc-templates
| | weather_data_pipeline | dataproc-templates |
|---|---|---|
| | 1 | 1 |
| Stars | 3 | 111 |
| Growth | - | 4.5% |
| Activity | 4.2 | 8.7 |
| | about 1 year ago | 2 days ago |
| Language | Python | Python |
| License | - | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
weather_data_pipeline
Building a Weather Data Pipeline with PySpark, Prefect, and Google Cloud
We'll be using PySpark for distributed data processing, Prefect for workflow management, and Google Cloud Storage and BigQuery for data storage and processing. The code is available on GitHub.
dataproc-templates
What are some alternatives?
magic-the-gathering - A complete pipeline to pull data from Scryfall's "Magic: The Gathering"-API, via Prefect orchestration and dbt transformation.
pubsub2inbox - Pubsub2Inbox is a versatile, multi-purpose tool to handle Pub/Sub messages and turn them into email, API calls, GCS objects, files or almost anything.
f1-data-pipeline - F1 Data Pipeline
ethereum-etl-airflow - Airflow DAGs for exporting, loading, and parsing Ethereum blockchain data. How to get any Ethereum smart contract into BigQuery: https://towardsdatascience.com/how-to-get-any-ethereum-smart-contract-into-bigquery-in-8-mins-bab5db1fdeee
prefect-deployment-patterns - Code examples showing flow deployment to various types of infrastructure
forseti-security - Forseti Security
youtube_data_analysis - Created an optimised pipeline to provide accurate data for analysis, then used Snowsight (provided by Snowflake) to create a dashboard.
Patek - A collection of reusable PySpark utility functions that help make development easier!
maternal-health-risk - Maternal Health Risk prediction MLOps pipeline
bigquery-utils - Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery.
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
gcp-flowlogs-reader - Command line tool and Python library for working with Google Cloud VPC Flow Logs