prefect-deployment-patterns vs weather_data_pipeline

| | prefect-deployment-patterns | weather_data_pipeline |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 93 | 3 |
| Growth | - | - |
| Activity | 0.0 | 4.2 |
| Latest commit | over 1 year ago | about 1 year ago |
| Language | Python | Python |
| License | Apache License 2.0 | - |
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity - a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
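The exact weighting behind the activity number is not published. As a rough illustration only, a recency-weighted score can be built with exponential decay; the half-life below is a made-up parameter, not the site's actual formula:

```python
from datetime import datetime, timezone
from math import exp, log

def activity_score(commit_dates, half_life_days=30.0):
    """Toy recency-weighted activity score: each commit contributes a
    weight that halves every `half_life_days`, so recent commits count
    more than older ones. Illustrative only, not the site's metric."""
    now = datetime.now(timezone.utc)
    score = 0.0
    for committed_at in commit_dates:
        age_days = (now - committed_at).total_seconds() / 86400
        score += exp(-age_days * log(2) / half_life_days)
    return score
```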
prefect-deployment-patterns
[D] Should I go with Prefect, Argo or Flyte for Model Training and ML workflow orchestration?
Have you used infrastructure blocks in Prefect? You could easily build a block for SageMaker infrastructure that runs the flow on GPUs, then run another flow in a local process, yet another one as a Kubernetes job, Docker container, ECS task, AWS Batch job, etc. It's super easy to set up, even from the UI or from CI/CD. There are a bunch of templates and examples here: https://github.com/anna-geller/prefect-deployment-patterns
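As a minimal sketch of that pattern, assuming Prefect 2.x, where infrastructure blocks such as `Process`, `DockerContainer`, and `KubernetesJob` ship with the core library (the flow body and image names are placeholders):

```python
from prefect import flow
from prefect.deployments import Deployment
from prefect.infrastructure import DockerContainer, KubernetesJob, Process

@flow
def train_model():
    # Placeholder flow body; real training logic would go here.
    print("training...")

# The same flow registered as three deployments, each bound to a
# different infrastructure block. (ECSTask from prefect-aws works the
# same way for ECS; omitted here to keep dependencies minimal.)
Deployment.build_from_flow(
    flow=train_model, name="local", infrastructure=Process(), apply=True
)
Deployment.build_from_flow(
    flow=train_model,
    name="docker",
    infrastructure=DockerContainer(image="my-org/train:latest"),
    apply=True,
)
Deployment.build_from_flow(
    flow=train_model,
    name="k8s",
    infrastructure=KubernetesJob(image="my-org/train:latest"),
    apply=True,
)
```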
weather_data_pipeline
Building a Weather Data Pipeline with PySpark, Prefect, and Google Cloud
We'll be using PySpark for distributed data processing, Prefect for workflow management, and Google Cloud Storage and BigQuery for data storage and processing. The code is available on GitHub.
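A minimal sketch of how those pieces can fit together; bucket, table, and column names are hypothetical, and it assumes Prefect 2.x, PySpark with the GCS connector configured, and `google-cloud-bigquery` installed:

```python
from google.cloud import bigquery
from prefect import flow, task
from pyspark.sql import SparkSession

@task
def transform(raw_path: str, out_path: str) -> None:
    # Aggregate raw weather readings with PySpark and write Parquet to GCS.
    spark = SparkSession.builder.appName("weather").getOrCreate()
    df = spark.read.csv(raw_path, header=True, inferSchema=True)
    daily = df.groupBy("station_id", "date").avg("temperature_c")
    daily.write.mode("overwrite").parquet(out_path)

@task
def load_to_bigquery(gcs_uri: str, table_id: str) -> None:
    # Load the Parquet files from GCS into a BigQuery table.
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.PARQUET
    )
    client.load_table_from_uri(gcs_uri, table_id, job_config=job_config).result()

@flow
def weather_pipeline():
    transform("gs://my-bucket/raw/weather.csv", "gs://my-bucket/curated/daily")
    load_to_bigquery("gs://my-bucket/curated/daily/*.parquet",
                     "my-project.weather.daily")

if __name__ == "__main__":
    weather_pipeline()
```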
What are some alternatives?
Taipy - Turns Data and AI algorithms into production-ready web applications in no time.
magic-the-gathering - A complete pipeline to pull data from Scryfall's "Magic: The Gathering"-API, via Prefect orchestration and dbt transformation.
Udacity-Data-Engineering-Projects - A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.
f1-data-pipeline - F1 Data Pipeline
buildflow - BuildFlow is an open-source framework for building large-scale systems using Python. All you need to do is describe where your input comes from and where your output should be written, and BuildFlow handles the rest. No configuration outside of the code is required.
youtube_data_analysis - Created an optimised pipeline to provide accurate data for analysis, then used Snowsight (provided by Snowflake) to create a dashboard.
canarypy - A light and powerful canary release framework for data pipelines.
dataproc-templates - Dataproc templates and pipelines for solving simple in-cloud data tasks
maternal-health-risk - Maternal Health Risk prediction MLOps pipeline
dataall - A modern data marketplace that makes collaboration among diverse users (business users, analysts, and engineers) easier, increasing efficiency and agility in data projects on AWS.
Prefect - The easiest way to build, run, and monitor data pipelines at scale.