Metric | catalog | docker-airflow
---|---|---
Mentions | 4 | 10
Stars | 640 | 3,703
Growth | 1.7% | -
Activity | 7.5 | 0.0
Last commit | 7 days ago | about 1 year ago
Language | Shell | Shell
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
catalog
-
Tekton clone from Gitea over SSH
Maybe it is this bug: https://github.com/tektoncd/catalog/issues/1052 …
-
How to reuse steps in Tekton tasks
The most obvious way would be to copy and paste the steps, but in more complex scenarios with n tasks this becomes error-prone. A template engine like Helm could help, but learning another templating engine and then having to change the contents of the tasks is also a burden. Instead, kustomize has a set of tools that make this job easier while still letting you reuse tasks from tektoncd/catalog.
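As a rough sketch of the kustomize approach described above, the following `kustomization.yaml` reuses an unmodified Task file and patches it instead of copying its steps. The file name `git-clone.yaml` (assumed to be a local copy of a Task from tektoncd/catalog), the Task name `git-clone`, and the new name `git-clone-custom` are all illustrative assumptions, not taken from the original post.

```yaml
# kustomization.yaml (illustrative sketch; all names are assumptions)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  # Assumed: a local, unmodified copy of a Task from tektoncd/catalog.
  - git-clone.yaml

patches:
  # JSON 6902 patch applied only to the Task named "git-clone".
  - target:
      kind: Task
      name: git-clone
    patch: |-
      - op: replace
        path: /metadata/name
        value: git-clone-custom
```

Running `kustomize build .` in that directory would print the patched Task, leaving the catalog file itself untouched so it can be updated independently.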
-
Cloud Native CI/CD with Tekton - Building Custom Tasks
Another common thing that you might need in your Tasks is some kind of storage where you can write data that can be used by subsequent steps in the Task or by other Tasks in the pipeline. The most common use case for this is a place to fetch a git repo into. This kind of storage is called a workspace in Tekton, and the following example shows a Task that mounts and clears the storage using rmdir:
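The example itself is not included in this excerpt. A minimal sketch of what such a Task could look like is given here; the Task name, workspace name, image, and script are assumptions for illustration, and newer clusters may expect `apiVersion: tekton.dev/v1` instead of `v1beta1`.

```yaml
# Illustrative sketch only; names, image, and script are assumptions.
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: clear-workspace
spec:
  workspaces:
    # Storage shared with later steps or Tasks, e.g. a git checkout target.
    - name: source
      description: Directory to clear before fetching a repository.
  steps:
    - name: clear
      image: alpine:3.19
      script: |
        #!/bin/sh
        set -eu
        # Tekton substitutes the mounted workspace path before the script runs.
        cd "$(workspaces.source.path)"
        # rmdir only removes empty directories, so delete non-directories first.
        find . -mindepth 1 ! -type d -exec rm -f {} +
        # Then remove the now-empty directories, deepest first.
        find . -mindepth 1 -depth -type d -exec rmdir {} +
```

A TaskRun or Pipeline that uses this Task would bind the `source` workspace to real storage, such as a PersistentVolumeClaim.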
- Problems with CI/CD on kubernetes
docker-airflow
-
Kubernetes deployment read-only filesystem error
I am facing an error about write permissions on the filesystem while deploying Airflow on Kubernetes (specifically this version of Airflow: https://github.com/puckel/docker-airflow/blob/1.8.1/Dockerfile).
-
How to use virtual environment in airflow DAGS?
I used https://github.com/puckel/docker-airflow to set up Airflow and moved my Python scripts into the dags directory, but now they won't execute because I can't access the libraries installed in the virtual environment. How can I find a workaround for this?
-
Amount of effort to stand up, integrate and manage a small airflow implementation
Used a custom version of Puckel Airflow Docker image (Spent a lot of time customising to our needs, but default Airflow container should still work)
-
The Unbundling of Airflow
I understand it is subjective. But I use a forked version of https://github.com/puckel/docker-airflow on our managed K8s cluster and it points to a cloud managed Postgres. It has worked pretty well for over 3 years with no-one actually managing it from an infra POV. YMMV. This is driving a product whose ARR is well in the 100s of Millions.
If you have simple needs that are more or less set, I agree Airflow is overkill and a simple Jenkins instance is all you need.
-
Airflow v1 to v2 - Recommendations / RoX
So we're running Airflow v1 (based on this docker compose) with a sequential executor on an on-prem OpenShift v3 setup. We have a new / free resource coming and plan to use them to stand up a completely new version on OpenShift v4 (also on-prem but not managed by us) and upgrade in parallel to Airflow v2. The question is whether anyone has strong recommendations on a good docker compose file to look at, and any views on Celery / Kubernetes workers. We're not a huge team but have a bit of experience up our sleeves now, so we're after some guidance or thoughts from others who have gone down similar paths. Thanks!
-
Can someone help me understand the difference between the docker-compose files?
version: '3'
services:
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    ports:
      - "5432:5432"
  webserver:
    image: puckel/docker-airflow:1.10.1
    build:
      context: https://github.com/puckel/docker-airflow.git#1.10.1
      dockerfile: Dockerfile
      args:
        AIRFLOW_DEPS: gcp_api,s3
        PYTHON_DEPS: sqlalchemy==1.2.0
    restart: always
    depends_on:
      - postgres
    environment:
      - LOAD_EX=n
      - EXECUTOR=Local
      - FERNET_KEY=jsDPRErfv8Z_eVTnGfF8ywd19j4pyqE3NpdUBA_oRTo=
    volumes:
      - ./examples/intro-example/dags:/usr/local/airflow/dags
      # Uncomment to include custom plugins
      # - ./plugins:/usr/local/airflow/plugins
    ports:
      - "8080:8080"
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
-
How should I get started with CI/CD ? (new to data engineering)
As for learning, learn how to build and use Docker containers. For Airflow, take a look at https://github.com/puckel/docker-airflow and see how to add your pipelines to that container. Then learn how to do CI/CD for Docker containers (tons of tutorials). Then learn to deploy containers; you can use AWS ECS.
-
Interview - take home project on data ingestion, warehouse design, basic analytics and conceptual using python and sql.
Usually googling the software you want + docker will get you what you need. For that particular project, I used https://github.com/puckel/docker-airflow to help set up a local airflow instance.
-
ETL with Apache Airflow, Web Scraping, AWS S3, Apache Spark and Redshift | Part 1
The Docker image used was puckel/docker-airflow, to which I added BeautifulSoup as a dependency when building the image on my machine.
-
How we evolved our data engineering workflow day by day
We used Airflow, a tool to schedule and monitor workflows, as our ELT processor, extracting data from SQL and NoSQL databases and loading it into the warehouse. Our Airflow deployment was done through Docker; for more details, check out puckel/airflow. Currently, we are adapting our image to the official Docker images.
What are some alternatives?
catalog - An Open Source PHP + MySQL application to manage your home library
orchest - Build data pipelines, the easy way 🛠️
infrastructure - Kubernetes infrastructure deployed by Terraform
ploomber - The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
tekton-kickstarter - Templates, scripts and samples for quickly building CI/CD with Tekton.
wordpress-docker-compose - Easy Wordpress development with Docker and Docker Compose
community - Community documentation for the Tekton project
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
reviewdog - 🐶 Automated code review tool integrated with any code analysis tools regardless of programming language
beginner_de_project - Beginner data engineering project - batch edition
tekton-tasks-kustomize - Customizing Tekton tasks with kustomize
movie_review_pipeline_airflow - A study project that implements an ETL process using Airflow, AWS S3, Web Scraping, Apache Spark and Redshift.