blogpost-airflow-hybrid
Udacity-Data-Engineering-Projects
| | blogpost-airflow-hybrid | Udacity-Data-Engineering-Projects |
|---|---|---|
| Mentions | 3 | 5 |
| Stars | 9 | 1,295 |
| Growth | - | - |
| Activity | 0.0 | 0.0 |
| Last commit | almost 2 years ago | over 1 year ago |
| Language | Python | Python |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
blogpost-airflow-hybrid
- Upgrading my AWS CDK stacks to AWS CDKv2
Once the CDK stacks were validated and clear for deployment, I encountered a "feature" when generating IAM policies and roles. It was whilst fixing a different CDK stack (Orchestrating Hybrid Workflows with Apache Airflow, GitHub repo blogpost-airflow-hybrid) that I found that, even though the stack had synthesised OK, I got the following error during deployment:
- Orchestrating hybrid workflows using Amazon Managed Workflows for Apache Airflow (MWAA)
As always, you can find the code for this walkthrough in this GitHub repo, blogpost-airflow-hybrid.
- Contributing to the Apache Airflow project
My solution was to use Apache Airflow and create a new workflow to orchestrate this. I planned to create an ETL script, and ensure the script can take parameters, to maximise reuse and flexibility.
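The parameterised ETL script mentioned in the excerpt above could look roughly like the sketch below. This is not code from the blogpost-airflow-hybrid repo: the flag names (`--source`, `--target`, `--min-value`), the `run_etl` helper, and the in-memory sample data are all hypothetical, chosen only to show how command-line parameters let one script be reused across workflow tasks.

```python
import argparse
import csv
import io


def parse_args(argv=None):
    # Hypothetical parameters; a real ETL script would define its own.
    parser = argparse.ArgumentParser(description="Parameterised ETL sketch")
    parser.add_argument("--source", required=True, help="label for the input data")
    parser.add_argument("--target", required=True, help="label for the output data")
    parser.add_argument("--min-value", type=int, default=0,
                        help="rows with a value below this are dropped")
    return parser.parse_args(argv)


def run_etl(rows, min_value):
    """Transform step: keep rows whose 'value' is at least min_value."""
    return [row for row in rows if int(row["value"]) >= min_value]


def main(argv=None):
    args = parse_args(argv)
    # Extract step: an in-memory CSV stands in for a real data source (assumption).
    sample = io.StringIO("id,value\n1,5\n2,15\n3,25\n")
    rows = list(csv.DictReader(sample))
    kept = run_etl(rows, args.min_value)
    # Load step: report instead of writing anywhere.
    print(f"{args.source} -> {args.target}: kept {len(kept)} of {len(rows)} rows")
    return kept
```

Calling `main(["--source", "a", "--target", "b", "--min-value", "10"])` runs the same logic an orchestrator task would trigger with different arguments per run.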
Udacity-Data-Engineering-Projects
- A question about data engineering?
- ✨ 5 Free Resources to Learn Data Engineering 🚀
🔗 https://github.com/san089/Udacity-Data-Engineering-Projects
- How can I become a big data engineer?
You can start by googling "data engineering learning path" to get a sense of what you need to know. If you are looking for simple projects to start with, you can also look at this repo (https://github.com/san089/Udacity-Data-Engineering-Projects).
- Beginner DE projects.
For practice, see Data Modeling with Postgres and Udacity Data Engineering Projects as examples, and Data Engineering Project for Beginners - Batch edition for a guided tutorial.
- Data Pipeline Examples in Action
What are some alternatives?
amazon-ecs-agent - Amazon Elastic Container Service Agent
hydra - Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes.
data-engineering-zoomcamp - Free Data Engineering course!
data-engineering-book - Accumulated knowledge and experience in the field of Data Engineering
ask-astro - An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer
pg-counter-metrics - PG Counter Metrics (PGCM) is a tool for publishing PostgreSQL performance data to CloudWatch. By publishing to CloudWatch, dashboards and alarming can be used on the collected data.
canarypy - CanaryPy - A light and powerful canary release for Data Pipelines
Data-Engineering-Projects - Personal Data Engineering Projects
prefect-deployment-patterns - Code examples showing flow deployment to various types of infrastructure
CloudCrack - [RELEASED] A CLI tool for large-scale password recovery operations using AWS
dlt - data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
StravaDataPipline - EtLT of my own Strava data using the Strava API, MySQL, Python, S3, Redshift, and Airflow