dagster-example-pipeline
canarypy
 | dagster-example-pipeline | canarypy
---|---|---
Mentions | 1 | 2
Stars | 64 | 3
Growth | - | -
Activity | 0.0 | 7.3
Last commit | about 2 years ago | 10 months ago
Language | Python | Python
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
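The page does not publish the formula behind the activity number, so the following is only a rough sketch of one way such a score could be built. It assumes an exponential decay with a hypothetical 90-day half-life for weighting commits, and a percentile rank scaled to 0-10, so that 9.0 corresponds to the top 10% of tracked projects. The function names and parameters are illustrative, not the site's actual implementation.

```python
from datetime import date


def weighted_commit_score(commit_dates, today, half_life_days=90.0):
    """Sum per-commit weights that halve every `half_life_days`.

    The half-life is an assumed value; recent commits count more than old ones.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_scores(raw_scores):
    """Map raw scores to a 0-10 scale by percentile rank among tracked projects.

    A project scores 10 * (share of projects with a strictly lower raw score).
    """
    n = len(raw_scores)
    result = {}
    for name, score in raw_scores.items():
        below = sum(1 for other in raw_scores.values() if other < score)
        result[name] = round(10.0 * below / n, 1)
    return result


if __name__ == "__main__":
    today = date(2024, 1, 1)
    # Hypothetical commit histories for three tracked projects.
    raw = {
        "active-project": weighted_commit_score(
            [date(2023, 12, 20), date(2023, 12, 28)], today
        ),
        "slow-project": weighted_commit_score([date(2023, 3, 1)], today),
        "idle-project": weighted_commit_score([], today),
    }
    print(activity_scores(raw))
```

Under these assumptions a project with no commits at all lands at 0.0, and the relative ranking, not the absolute commit count, determines the final number.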
dagster-example-pipeline
- Developing in Dagster
The associated code repo can be found here
canarypy
- Ask HN: Open-Source Canary Release Tool – Seeking Your Feedback
I'm excited to introduce CanaryPy, a new open-source tool that makes your data pipelines more robust by introducing new releases in a way that minimises the impact of unanticipated issues. It currently has a plugin for Apache Airflow, with more to come.
We'd love for you to check it out on GitHub: https://github.com/thcidale0808/canarypy
Your feedback and suggestions for improvement are precious to us. What features would you like to see? How's the usability? Would you have any thoughts on integration with your current tools?
Thank you in advance for your insights!
- Introducing Canary Release Tool to integrate with Apache Airflow - Seeking Your Feedback!
We'd love for you to check it out on GitHub: https://github.com/thcidale0808/canarypy
What are some alternatives?
mlrun - MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.
Udacity-Data-Engineering-Projects - A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
Apache Superset - Apache Superset is a Data Visualization and Data Exploration Platform [Moved to: https://github.com/apache/superset]
pyStudio - The easier way to do machine learning in Python without coding!
AWS Data Wrangler - pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
prefect-deployment-patterns - Code examples showing flow deployment to various types of infrastructure
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
dagster - An orchestration platform for the development, production, and observation of data assets.
portable-data-stack-dagster - A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB, PostgreSQL and Superset
aws-data-wrangler - pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL). [Moved to: https://github.com/aws/aws-sdk-pandas]
youtube_data_analysis - Created an optimised pipeline to provide accurate data for analysis, then used Snowsight (provided by Snowflake) to create a dashboard.