dagster-example-pipeline vs Prefect
| | dagster-example-pipeline | Prefect |
|---|---|---|
| Mentions | 1 | 19 |
| Stars | 64 | 14,645 |
| Growth | - | 1.8% |
| Activity | 0.0 | 10.0 |
| Last commit | about 2 years ago | 7 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
dagster-example-pipeline
- Developing in Dagster
  The associated code repo can be found here
Prefect
- Prefect: A workflow orchestration tool for data pipelines
- self hosted Alternative to easycron.com?
- Example typescript project repos?
  If I was answering this question but for python, I'd recommend something like prefect, boto3, or tortoise-orm -- not extremely complex and with a pretty comprehensible featureset.
- I have developed a simple Task Orchestrator
  However, if you are looking for something like this, but much more mature and something of a bloat to be frank, there's Prefect. Honestly, woflo borrows a lot from Prefect conceptually.
- Dabbling with Dagster vs. Airflow
  Disclaimer: I work for Prefect.
  It looks like we added cron and other schedule types to the deployment CLI just under a month ago[1].
  Over the last couple of releases, we've also made it easier to pull deployments from GitHub or bake your flow code into Docker images instead of needing S3-like storage.
  As with any product, there's always more to do, so I appreciate you sharing your thoughts. More than anywhere else I've worked, community feedback is a huge driver of product enhancements and feature development. Feel free to join our Slack community[2] if you'd like to share more feedback or ask questions.
  [1] https://github.com/PrefectHQ/prefect/blob/main/RELEASE-NOTES...
- Prefect - The easiest way to automate your data
- Ask HN: Codebases with great, easy to read code?
- Prefect CLI Action
  GitHub Action for running Prefect commands using the Prefect CLI.
- Prefect – Data workflow automation with Python
What are some alternatives?
mlrun - MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.
dagster - An orchestration platform for the development, production, and observation of data assets.
Apache Superset - Apache Superset is a Data Visualization and Data Exploration Platform [Moved to: https://github.com/apache/superset]
APScheduler - Task scheduling library for Python
AWS Data Wrangler - pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
canarypy - CanaryPy - A light and powerful canary release for Data Pipelines
schedule - Python job scheduling for humans.
portable-data-stack-dagster - A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB, PostgreSQL and Superset
doit - task management & automation tool
aws-data-wrangler - pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL). [Moved to: https://github.com/aws/aws-sdk-pandas]
django-schedule - A calendaring app for Django. It is now stable, Please feel free to use it now. Active development has been taken over by bartekgorny.