airflow-docker
trino-getting-started
Metric | airflow-docker | trino-getting-started
---|---|---
Mentions | 2 | 2
Stars | 223 | 228
Growth | - | -
Activity | 3.0 | 5.1
Last commit | 2 months ago | 18 days ago
Language | Python | Python
License | - | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
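The exact formula behind the activity number is not given here, so the following is only a minimal sketch of one way such a score could work. It assumes an exponential decay with a 30-day half-life for commit weights and a simple percentile rank mapped onto a 0-10 scale; the function names and the `half_life_days` parameter are hypothetical and chosen for illustration.

```python
from bisect import bisect_left
from datetime import date


def recency_weighted_activity(commit_dates, today, half_life_days=30.0):
    """Raw activity: each commit contributes a weight that halves every
    `half_life_days` (an assumed decay, not the site's real formula)."""
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def relative_score(raw, all_raw_scores):
    """Map a raw score to a 0-10 scale by percentile rank, so that 9.0
    means the project outranks 90% of the tracked projects."""
    ranked = sorted(all_raw_scores)
    below = bisect_left(ranked, raw)  # projects with a strictly lower raw score
    return round(10.0 * below / len(ranked), 1)


if __name__ == "__main__":
    today = date(2024, 1, 31)
    recent = recency_weighted_activity([date(2024, 1, 30)], today)
    stale = recency_weighted_activity([date(2023, 1, 30)], today)
    print(recent > stale)  # a recent commit outweighs an old one
    print(relative_score(9, range(10)))
```

Ranking against the other tracked projects, rather than reporting the raw sum, is what makes the number relative: the same commit history can score differently as the tracked set changes.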
airflow-docker
- ETL with python
  You can watch my Apache Airflow for Beginners Tutorial Series playlist on YouTube. If you find it helpful, consider subscribing to my YouTube channel and starring my GitHub repository. Comment with the Airflow topics you would like to see or discuss in the next episode.
- Apache Airflow for Beginners Tutorial Series
  If you are interested, you can watch the whole playlist on YouTube. If you find it helpful, consider subscribing to my YouTube channel and starring my GitHub repository.
trino-getting-started
- Trying Delta Lake at home
  https://github.com/bitsondatadev/trino-getting-started/tree/main/delta-lake => Trino (a Presto "equivalent") + Delta Lake format + MinIO (an S3 equivalent)
- (Almost) open-source data stack for a personal DE project. Before jumping into the project, I would like some advice on things to fix or improve in this structure. Do you think this stack could work?
  Here’s a small deployment with MinIO to play with: https://github.com/bitsondatadev/trino-getting-started/tree/main/hive/trino-minio
What are some alternatives?
ansible-docker - Install / Configure Docker and Docker Compose using Ansible.
dbt-spark - dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
elyra - Elyra extends JupyterLab with an AI centric approach.
delta - An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
ploomber - The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
sqlglot - Python SQL Parser and Transpiler
netbox-docker - 🐳 Docker Image of NetBox
docker-spark-deltalake - Docker image for running SparkSQL Thrift server
docker-autocompose - Generate a docker-compose yaml definition from a running container
fastapi-realworld-example-app - Backend logic implementation for https://github.com/gothinkster/realworld with awesome FastAPI
airflow-docker - My local Apache Airflow development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and workflows.
delta-docs - Delta Lake Documentation