pre-commit
Airflow
Metric | pre-commit | Airflow
---|---|---
Mentions | 192 | 169
Stars | 12,087 | 34,570
Growth | 1.7% | 1.4%
Activity | 8.0 | 10.0
Latest commit | 5 days ago | 2 days ago
Language | Python | Python
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pre-commit
-
How to setup Black and pre-commit in python for auto text-formatting on commit
Today we are going to look at how to set up Black (a Python code formatter) and pre-commit (a package for handling git hooks in Python) to automatically format your code on commit.
-
Implementing Quality Checks In Your Git Workflow With Hooks and pre-commit
    # See https://pre-commit.com for more information
    # See https://pre-commit.com/hooks.html for more hooks
    repos:
      - repo: https://github.com/pre-commit/pre-commit-hooks
        rev: v3.2.0
        hooks:
          - id: trailing-whitespace
          - id: end-of-file-fixer
          - id: check-yaml
          - id: check-toml
          - id: check-added-large-files
      - repo: local
        hooks:
          - id: tox lint
            name: tox-validation
            entry: pdm run tox -e test,lint
            language: system
            files: ^src\/.+py$|pyproject.toml|^tests\/.+py$
            types_or: [python, toml]
            pass_filenames: false
          - id: tox docs
            name: tox-docs
            language: system
            entry: pdm run tox -e docs
            types_or: [python, rst, toml]
            files: ^src\/.+py$|pyproject.toml|^docs\/
            pass_filenames: false
      - repo: https://github.com/pdm-project/pdm
        rev: 2.10.4 # a PDM release exposing the hook
        hooks:
          - id: pdm-lock-check
      - repo: https://github.com/jumanjihouse/pre-commit-hooks
        rev: 3.0.0
        hooks:
          - id: markdownlint
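The `files` entries in the config above are Python regular expressions. As a rough sketch, assuming pre-commit applies them with `re.search` against the repository-relative path (as its documentation describes), the snippet below checks which paths would trigger the `tox lint` hook. The helper name `hook_would_run` and the sample paths are hypothetical, and the `types_or` filter is ignored here.

```python
import re

# Pattern copied from the `tox lint` hook in the config above.
FILES_PATTERN = re.compile(r"^src\/.+py$|pyproject.toml|^tests\/.+py$")


def hook_would_run(path: str) -> bool:
    """Return True if the path matches the hook's `files` pattern."""
    return FILES_PATTERN.search(path) is not None


if __name__ == "__main__":
    # Hypothetical paths, chosen only for illustration.
    samples = [
        "src/app/main.py",
        "tests/test_main.py",
        "pyproject.toml",
        "docs/index.rst",
    ]
    for path in samples:
        print(f"{path}: {hook_would_run(path)}")
```

Because the pattern ends in `py$` rather than `\.py$`, a path such as `src/happy` would also match, so a stricter pattern may be worth considering.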
-
Embracing Modern Python for Web Development
Pre-commit hooks act as the first line of defense in maintaining code quality, seamlessly integrating with linters and code formatters. They automatically execute these tools each time a developer tries to commit code to the repository, ensuring the code adheres to the project's standards. If the hooks detect issues, the commit is blocked until the issues are resolved, guaranteeing that only code meeting quality standards makes it into the repository.
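To make the blocking behavior concrete, here is a minimal sketch of a custom hook script, not taken from any of the linked articles. It assumes the hook receives the staged file names as command-line arguments (pre-commit's default) and relies on the fact that a non-zero exit status makes the hook fail, which stops the commit. The function names are illustrative.

```python
import sys
from pathlib import Path


def find_trailing_whitespace(text: str) -> list[int]:
    """Return 1-based line numbers whose content ends in a space or tab."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


def main(paths: list[str]) -> int:
    """Check each file; return 1 if any file has an offending line, else 0."""
    failed = False
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        for number in find_trailing_whitespace(text):
            print(f"{path}:{number}: trailing whitespace")
            failed = True
    return 1 if failed else 0


if __name__ == "__main__":
    # Staged file names arrive as arguments; a non-zero status fails the hook.
    exit_code = main(sys.argv[1:])
    if exit_code:
        sys.exit(exit_code)
```

A script like this could be wired in as a `repo: local` hook with `language: system`, in the same way as the tox hooks in the config shown earlier on this page.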
- EmacsConf Live Now
-
A Tale of Two Kitchens - Hypermodernizing Your Python Code Base
Pre-commit Hooks: Pre-commit is a tool that can be set up to enforce coding rules and standards before you commit your changes to your code repository. This ensures that you can't even check in (commit) code that doesn't meet your standards. This allows a code reviewer to focus on the architecture of a change while not wasting time with trivial style nitpicks.
-
Things I just don't like about Git
Ah, fair enough!
On my team we use pre-commit[0] a lot. I guess I would define the history to be something like "has this commit ever been run through our pre-commit hooks?". If you rewrite history, you'll (usually) produce commits that have not been through pre-commit (and they've therefore dodged a lot of static checks that might catch code that wasn't working, at that point in time). That gives some manner of objectivity to the "history", although it does depend on each user having their pre-commit hooks activated in their local workspace.
[0]: https://pre-commit.com/
-
Django Code Formatting and Linting Made Easy: A Step-by-Step Pre-commit Hook Tutorial
Pre-commit is a framework for managing and maintaining multi-language pre-commit hooks. It supports hooks for various programming languages. Using this framework, you only have to specify a list of hooks you want to run before every commit, and pre-commit handles the installation and execution of those hooks regardless of your project’s primary language.
-
Git: fu** the history!
You can learn more here: pre-commit.com
-
[Tool Announcement] github-distributed-owners - A tool for managing GitHub CODEOWNERS using OWNERS files distributed throughout your code base. Especially helpful for monorepos / multi-team repos
Note this includes support for pre-commit.
-
Packaging Python projects in 2023 from scratch
As a nice next step, you could also add mypy to check your type hints are consistent, and automate running all this via pre-commit hooks set up with… pre-commit.
Airflow
-
Building in Public: Leveraging Tublian's AI Copilot for My Open Source Contributions
Contributing to Apache Airflow's open-source project immersed me in collaborative coding. Experienced maintainers rigorously reviewed my contributions, providing constructive feedback. This ongoing dialogue refined the codebase and honed my understanding of best practices.
-
Navigating Week Two: Insights and Experiences from My Tublian Internship Journey
In week two, I contributed to the Apache Airflow repository.
-
Airflow VS quix-streams - a user suggested alternative
2 projects | 7 Dec 2023
-
Best ETL Tools And Why To Choose
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. The platform features a web-based user interface and a command-line interface for managing and triggering workflows.
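Airflow models a workflow as a directed acyclic graph (DAG) of tasks, where each task runs only after the tasks it depends on. The sketch below illustrates that ordering idea using only Python's standard-library `graphlib`; it is not Airflow's API, and the task names and the `run_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for task in run_order(PIPELINE):
        print(f"running {task}")
```

A real orchestrator adds scheduling, retries, and monitoring on top of this ordering, which is the part the excerpt above attributes to Airflow's web interface and command-line tools.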
-
Simplifying Data Transformation in Redshift: An Approach with DBT and Airflow
Airflow is the most widely used and well-known tool for orchestrating data workflows. It allows for efficient pipeline construction, scheduling, and monitoring.
-
Share Your favorite python related software!
AIRFLOW This is more of a library in my opinion, but Airflow has become an essential tool for scheduling in my work. All our ML training pipelines are ordered and scheduled with Airflow and it works seamlessly. The dashboard provided is also fantastic!
-
Ask HN: What is the correct way to deal with pipelines?
I agree there are many options in this space. Two others to consider:
- https://airflow.apache.org/
- https://github.com/spotify/luigi
There are also many Kubernetes based options out there. For the specific use case you specified, you might even consider a plain old Makefile and incrond if you expect these all to run on a single host and be triggered by a new file showing up in a directory…
- "You came to protest for access to the voting machines' source code. What is source code?" "I don't know" 🤡
- How to build your own data platform. From zero to hero.
-
Is it impossible to contribute to open source as a data engineer?
You can try and contribute some new connectors/operators for workflow managers like Airflow or Airbyte
What are some alternatives?
husky - Git hooks made easy 🐶 woof!
Kedro - Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
gitleaks - Protect and discover secrets using Gitleaks 🔑
dagster - An orchestration platform for the development, production, and observation of data assets.
ruff - An extremely fast Python linter and code formatter, written in Rust.
n8n - Free and source-available fair-code licensed workflow automation tool. Easily automate tasks across different services.
semgrep - Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.
luigi - Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Poetry - Python packaging and dependency management made easy
Apache Spark - A unified analytics engine for large-scale data processing
pre-commit-golang - Pre-commit hooks for Golang with support for monorepos, the ability to pass arguments and environment variables to all hooks, and the ability to invoke custom go tools.
Dask - Parallel computing with task scheduling