dataproc-templates
Dataproc templates and pipelines for solving in-cloud data tasks (by GoogleCloudPlatform)
weather_data_pipeline
This is a PySpark-based data pipeline that fetches weather data for a few cities, performs some basic processing and transformation on the data, and then writes the processed data to a Google Cloud Storage bucket and a BigQuery table. The data is then viewed in a Looker dashboard (by 24jmwangi)
| | dataproc-templates | weather_data_pipeline |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 153 | 6 |
| Growth | - | - |
| Activity | 6.9 | 4.2 |
| Last commit | about 1 month ago | about 3 years ago |
| Language | Python | Python |
| License | Apache License 2.0 | - |
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
dataproc-templates
Posts with mentions or reviews of dataproc-templates.
We have used some of these posts to build our list of alternatives
and similar projects.
weather_data_pipeline
Posts with mentions or reviews of weather_data_pipeline.
Building a Weather Data Pipeline with PySpark, Prefect, and Google Cloud
We'll be using PySpark for distributed data processing, Prefect for workflow management, and Google Cloud Storage and BigQuery for data storage and processing. The code is available on GitHub.
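The fetch, transform, and write stages described for weather_data_pipeline can be sketched in plain Python. This is only an illustration of the shape of such a pipeline, not the project's code: the real project uses PySpark and Prefect, while the function names, the record fields (`temp_kelvin`, `humidity`), the placeholder readings, and the in-memory "sinks" standing in for Google Cloud Storage and BigQuery are all assumptions made up for this example.

```python
from __future__ import annotations

import json
from typing import Callable, Iterable

# Hypothetical record shape; the real pipeline's schema is not shown here.
Record = dict


def fetch_weather(cities: Iterable[str]) -> list[Record]:
    """Stand-in for the fetch step. Returns placeholder readings, not real data."""
    return [{"city": city, "temp_kelvin": 300.0, "humidity": 50} for city in cities]


def transform(records: Iterable[Record]) -> list[Record]:
    """Basic processing: convert Kelvin to Celsius and drop the raw field."""
    rows = []
    for rec in records:
        row = {key: value for key, value in rec.items() if key != "temp_kelvin"}
        row["temp_celsius"] = round(rec["temp_kelvin"] - 273.15, 2)
        rows.append(row)
    return rows


def run_pipeline(
    cities: Iterable[str],
    sinks: Iterable[Callable[[list[Record]], None]],
) -> list[Record]:
    """Fetch, transform, then hand the same rows to every sink.

    In the real project the sinks would be a storage bucket and a warehouse
    table; here they are arbitrary callables (an assumption of this sketch).
    """
    rows = transform(fetch_weather(cities))
    for sink in sinks:
        sink(rows)
    return rows


def to_json_lines(rows: Iterable[Record]) -> str:
    """Serialize rows as newline-delimited JSON, a common load format."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


if __name__ == "__main__":
    # "CityA" and "CityB" are placeholder names.
    run_pipeline(["CityA", "CityB"], [lambda rows: print(to_json_lines(rows))])
```

Keeping each stage a separate function mirrors how a workflow tool would treat them as individual tasks, and passing sinks as callables lets the same rows go to more than one destination.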
What are some alternatives?
When comparing dataproc-templates and weather_data_pipeline you can also consider the following projects:
pubsub2inbox - Pubsub2Inbox is a versatile, multi-purpose tool to handle Pub/Sub messages and turn them into email, API calls, GCS objects, files or almost anything.
prefect-deployment-patterns - Code examples showing flow deployment to various types of infrastructure
gcp-flowlogs-reader - Command line tool and Python library for working with Google Cloud VPC Flow Logs
maternal-health-risk - Maternal Health Risk prediction MLOps pipeline
ethereum-etl-airflow - Airflow DAGs for exporting, loading, and parsing the Ethereum blockchain data. How to get any Ethereum smart contract into BigQuery: https://towardsdatascience.com/how-to-get-any-ethereum-smart-contract-into-bigquery-in-8-mins-bab5db1fdeee
f1-data-pipeline - F1 Data Pipeline