Metric | dataall | flowrunner
---|---|---
Mentions | 1 | 3
Stars | 213 | 8
Growth | 3.8% | -
Activity | 9.4 | 5.9
Last commit | 4 days ago | 11 months ago
Language | Python | Python
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
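The exact formula behind the activity number is not given here. As a rough illustration only, the sketch below assumes an exponential recency weighting for commits and a percentile rank scaled to 0-10, so that a value of 9.0 means 90% of tracked projects score lower. The function names and the `half_life_days` parameter are hypothetical and are not taken from the site.

```python
from bisect import bisect_left


def weighted_commit_score(commit_ages_days, half_life_days=30.0):
    """Sum per-commit weights that halve every `half_life_days` (assumed weighting)."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_index(score, all_scores):
    """Map a raw score to a 0-10 scale by percentile rank among all tracked projects."""
    ranked = sorted(all_scores)
    # Number of tracked projects with a strictly lower score.
    below = bisect_left(ranked, score)
    return round(10.0 * below / len(ranked), 1)


if __name__ == "__main__":
    # Hypothetical population of 100 tracked projects with raw scores 0..99.
    tracked = [float(i) for i in range(100)]
    print(activity_index(90.0, tracked))  # 9.0: ahead of 90% of tracked projects
```

Under these assumptions, a commit made today counts fully, a commit from 30 days ago counts half as much, and the final index depends only on how a project ranks against the others being tracked.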
dataall
Newsletter martinmueller.dev 2022 week 19
And again a nice weekly summary. My highlight is aws-dataall, as it looks super interesting for sharing ML data within a company and even externally. Lots to think about here. But as always there is tons of other gold, so you have to explore that!
flowrunner
Python Package to build ETL flows/dags
Repo link: https://github.com/prithvijitguha/flowrunner
What are some alternatives?
projen - Rapidly build modern applications with advanced configuration management
great_expectations - Always know what to expect from your data.
prefect-deployment-patterns - Code examples showing flow deployment to various types of infrastructure
Apache Superset - Apache Superset is a Data Visualization and Data Exploration Platform [Moved to: https://github.com/apache/superset]
covid-19-data-engineering-pipeline - A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
patterns-devkit - Data pipelines from re-usable components
senjuns - Monorepo for wiki, landingpage, AWS CDK code and more for Senjuns. Senjuns is (will be) a freelancer platform for connecting seniors and juniors with clients.
hamilton - A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
MyVoteAWS - Beginner AWS project for learning how various components work by building a voting app
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
hamilton - Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage and metadata. Runs and scales everywhere python does.