airflow-docker
cargo-crates
| | airflow-docker | cargo-crates |
|---|---|---|
| Mentions | 1 | 3 |
| Stars | 21 | 1 |
| Growth | - | - |
| Activity | 7.0 | 3.1 |
| Last commit | 3 months ago | 20 days ago |
| Language | Python | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
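One way to read the activity scale is sketched below. It assumes scores run from 0 to 10 and map linearly onto percentile rank; the text only confirms the single point "9.0 means top 10%", so the linear mapping and the function name `top_percent` are assumptions made for illustration, not the site's actual formula.

```python
def top_percent(activity: float, scale_max: float = 10.0) -> float:
    """Return the 'top N%' bracket implied by an activity score.

    Assumption: scores run from 0 to scale_max and map linearly to
    percentile rank. Only the 9.0 -> top 10% point comes from the text.
    """
    if not 0.0 <= activity <= scale_max:
        raise ValueError(f"activity must be between 0 and {scale_max}")
    # Distance from the top of the scale, expressed as a percentage.
    return round((scale_max - activity) / scale_max * 100.0, 1)


if __name__ == "__main__":
    # Activity scores taken from the comparison table.
    for name, score in [("airflow-docker", 7.0), ("cargo-crates", 3.1)]:
        print(f"{name}: activity {score} -> roughly top {top_percent(score)}%")
```

Under that assumption, a higher score always means a smaller "top N%" bracket, which matches the description that recently active projects rank higher.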
airflow-docker
- Airflow Api tests
  Clone the airflow-docker repo.
cargo-crates
- Docker - Magic or Hype?
  I've used this benefit in one of my personal side projects (cargo-crates) to have ready-made containers for data extraction purposes. I'm always picking up projects and putting them back down, or shifting which versions of different libraries I have on my laptop, so picking up an old project with specific library dependencies can be really annoying.
- Your default tool for ETL
  I went a little crazy and built my own set of data extractors that I can deploy with CDK to ECS.
- Why is it so hard to think of a DE side project idea?
  - Extract data from system. I wear an Oura ring for sleep tracking. I wanted to do my own analysis of the data, so I built a system that could easily allow me to extract the data into S3 so I could query it. https://github.com/dacort/cargo-crates Will anybody find that useful? Maybe...but it's been a heck of a lot of fun and really pushed my Docker skills.
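The "extract into S3 so I can query it" pattern from the last mention can be sketched roughly as follows. This is not how cargo-crates is implemented; the helper names, the key layout, the bucket name, and the sample record are all hypothetical. The only real library call is boto3's `put_object`, imported lazily so the pure helpers run without AWS dependencies.

```python
import json
from datetime import date
from typing import Iterable, Mapping


def to_json_lines(records: Iterable[Mapping]) -> bytes:
    """Serialize records as newline-delimited JSON, a format most query engines can read."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records).encode("utf-8")


def build_key(source: str, day: date, prefix: str = "raw") -> str:
    """Build a date-partitioned object key (hypothetical layout)."""
    return f"{prefix}/{source}/dt={day.isoformat()}/data.jsonl"


def upload(bucket: str, key: str, body: bytes) -> None:
    """Write the payload to S3. Requires boto3 and AWS credentials."""
    import boto3  # lazy import: only needed when actually uploading

    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body)


if __name__ == "__main__":
    # Made-up placeholder record, not real extractor output.
    sample = [{"day": "2022-01-01", "value": 1}]
    key = build_key("example-source", date(2022, 1, 1))
    body = to_json_lines(sample)
    print(key, len(body), "bytes")
    # upload("my-hypothetical-bucket", key, body)  # uncomment with real credentials
```

Partitioning keys by date is a common choice because it lets a query engine skip whole prefixes, but any stable layout would work here.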
What are some alternatives?
soda-sql - Data profiling, testing, and monitoring for SQL accessible data.
dbt-spark - dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
wsl-windows-toolbar-launcher - Adds linux GUI application menu to a windows toolbar
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
superset - Apache Superset is a Data Visualization and Data Exploration Platform
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
nft-starter-kit - Timescale NFT Starter Kit
Apache Superset - Apache Superset is a Data Visualization and Data Exploration Platform [Moved to: https://github.com/apache/superset]
airflow-api-tests - This is a collection of pytest tests for the Apache Airflow 2.0 stable REST APIs. I have another repo where you can set up Airflow locally and play around with these. I am used to RestAssured, but I am trying out pytest here.
portable-data-stack-dagster - A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB, PostgreSQL and Superset
airflow-docker - Source code of the Apache Airflow Tutorial for Beginners on YouTube Channel Coder2j (https://www.youtube.com/c/coder2j)
StravaDataPipline - EtLT of my own Strava data using the Strava API, MySQL, Python, S3, Redshift, and Airflow