astro
typhoon-orchestrator
astro | typhoon-orchestrator
---|---
2 | 14
183 | 29
- | -
10.0 | 0.0
over 1 year ago | over 1 year ago
Python | Python
Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
astro
- After Airflow. Where next for DE?
What I would suggest is that if you want an "Airflow 3.0" feel, you check out the Astro SDK. My team and I spent a year and a half rewriting the Airflow DAG-writing experience from the ground up. It has a completely different feel: highly scalable SQL, Python, and (soon) Spark workflows that feel like native Python and are much easier to test. You can pass dataframes into SQL queries, load data from any supported source to any supported warehouse, and things like lineage are natively supported :)
typhoon-orchestrator
- After Airflow. Where next for DE?
- New OSS Orchestrator - Where should we go next?
- Airflow's Problem
I have my own opinion on Airflow's pain points and created Typhoon Orchestrator (https://github.com/typhoon-data-org/typhoon-orchestrator) to solve them. It doesn't have many stars yet, but I've used it to create some pipelines for medium-sized companies in a few days, and they've been running for over a year without issues.
In particular, I transpile to Airflow code (it can also deploy to Lambda) because I think Airflow is still the most robust and well-supported "runtime"; I just don't think the developer experience is that good.
- Data Engineering for very small businesses. Any experiences?
Typhoon Orchestrator: This is a framework that I designed to fix some of the pain points of Airflow so that I could build, test, and deploy pipelines faster. You could skip this step, but if you want more info check here.
- CSV data library to database
I am also collaborating on an open source tool called Typhoon Orchestrator (repo). It aims to make composing Airflow data pipelines simple and quick, putting pipeline steps together like Lego.
- Recommendations for simple ETL (Postgres to Snowflake)
The project (https://github.com/typhoon-data-org/typhoon-orchestrator) doesn't have many stars yet, but I have deployed it at a medium-sized hotel chain for several data sources with a use case similar to yours, and it has been working for over a year with no intervention. If you decide to pursue this option, I'd be willing to provide some support free of charge (feel free to PM me).
- Impress your friends! Make a serverless bot that sends daily jokes to a Telegram Group
Typhoon Orchestrator is a great way to deploy ETL workflows on AWS Lambda. In this tutorial we intend to show how easy to use and versatile it is by deploying code to Lambda that gets a random joke from https://jokeapi.dev once a day and sends it to your Telegram group.
- My Thirty Years of Dodging Repetitive Work with Automation Tools
I think there's space for an open source library that can help with what you described. We originally created https://github.com/typhoon-data-org/typhoon-orchestrator to orchestrate ETL workflows, which would be a superset of the use cases you described. Our next goal is to allow deployment to AWS Lambda, which can be a good compromise between getting locked in with SaaS and hosting your own infrastructure.
Also check out Zappa's scheduled tasks, which have a similar goal and inspired our library.
- Airflow, you complete me! Compose YAML DAGs for Airflow with auto-complete with Typhoon (Open Source).
- Use Airflow? Composable elegant YAML DAGS that transpile to Airflow. Zero risk and no migration.
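The joke-bot post listed above describes fetching a random joke once a day and sending it to a Telegram group. As a rough illustration of that flow only, here is a minimal plain-Python sketch. It does not use Typhoon's own API or YAML syntax, which the snippets on this page do not show. The endpoint `https://v2.jokeapi.dev/joke/Any`, the function names, and the `token`/`chat_id` parameters are assumptions made for illustration; scheduling (for example, a daily Lambda trigger) is left out.

```python
import json
import urllib.parse
import urllib.request

# Assumed JokeAPI endpoint; the post only names https://jokeapi.dev.
JOKE_URL = "https://v2.jokeapi.dev/joke/Any"


def format_joke(payload: dict) -> str:
    """Turn a JokeAPI response into one message string.

    JokeAPI returns either a "single" joke (field "joke") or a
    "twopart" joke (fields "setup" and "delivery").
    """
    if payload.get("type") == "twopart":
        return f"{payload['setup']}\n{payload['delivery']}"
    return payload["joke"]


def build_telegram_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build a POST request for the Telegram Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    return urllib.request.Request(url, data=data, method="POST")


def fetch_joke(url: str = JOKE_URL) -> dict:
    """Fetch one random joke as a parsed JSON dict (performs a network call)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def send_daily_joke(token: str, chat_id: str) -> dict:
    """Fetch a joke and send it to the chat; returns Telegram's JSON reply."""
    request = build_telegram_request(token, chat_id, format_joke(fetch_joke()))
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)
```

Calling `send_daily_joke(token, chat_id)` from a scheduled job would cover the whole flow; the formatting and request-building helpers are kept separate so they can be checked without network access.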
What are some alternatives?
astro-sdk - Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
JokeAPI - REST API that serves uniformly and well formatted jokes in JSON, XML, YAML or plain text format that also offers a great variety of filtering methods
airflow-maintenance-dags - A series of DAGs/Workflows to help maintain the operation of Airflow
Mage - 🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
getting-started - This repository is a getting started guide to Singer.
pachyderm - Data-Centric Pipelines and Data Versioning
sqlelf - Explore ELF objects through the power of SQL
f1-data-pipeline - F1 Data Pipeline
jmespath.py - JMESPath is a query language for JSON.