Top 23 Python Workflow Projects
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Project mention: Airflow Plugin - How I wrote custom Airflow Plugins | dev.to | 2022-08-01
So apache-airflow-backport-providers-amazon does support EC2, but only for starting an instance with EC2StartInstanceOperator and stopping it with EC2StopInstanceOperator, given that the instance_id is known. It is missing create and terminate functionality.
The easiest way to build, run, and monitor data pipelines at scale.
Project mention: Prefect - The easiest way to automate your data | reddit.com/r/github | 2022-05-21
An orchestration platform for the development, production, and observation of data assets.
Project mention: Field Lineage | reddit.com/r/dataengineering | 2022-08-02
There are specialized tools like DataHub (see this for column-level reporting: https://feature-requests.datahubproject.io/roadmap/541 ) that would help. But really, in a good data platform, the orchestration layer should be aggregating metadata and giving you everything you need to trace lineage. A tool like Dagster does this well if you make full use of the Software Defined Assets capability, but that is fairly new, so not many people have embraced it yet.
MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
Project mention: [D] Kubernetes for ML - how are y'all doing it? | reddit.com/r/MachineLearning | 2022-04-14
We use Polyaxon and it’s pretty good
A modern Python package and dependency manager supporting the latest PEP standards
Project mention: What do you hate most about Python? | reddit.com/r/Python | 2022-07-25
Prerequisite XKCD, and then pdm
The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
Project mention: Analyze and plot 5.5M records in 20s with BigQuery and Ploomber | dev.to | 2022-08-08
Since our analysis comprises SQL and Python, we use Ploomber, an open-source framework to write maintainable pipelines. It abstracts all the details, so we focus on writing the SQL and Python scripts.
Kubernetes-native workflow automation platform for complex, mission-critical data and ML processes at scale. It has been battle-tested at Lyft, Spotify, Freenome, and others and is truly open-source.
Project mention: Airflow's Problem | news.ycombinator.com | 2022-08-02
Some of these were the core problems that we wanted to address as part of https://flyte.org. We started with a team-first and multi-tenant approach at the core. For example, each team can have separate IAM roles, secrets are restricted to teams, tasks and workflows are shareable across teams without making libraries, and it is possible to trigger workflows across teams.
Tools for managing DNS across multiple providers
Project mention: The Dhall Configuration Language | news.ycombinator.com | 2022-07-14
We use https://github.com/octodns/octodns for some of our DNS records. It's flexible, much faster than Terraform for thousands of records, and the maintainer Ross has been responsive on issues and pull requests. Also see Cloudflare's blog for how they use it.
Reusable workflow library for Django
Project mention: How to create a django ViewFlow process programmatically | reddit.com/r/codehunter | 2022-04-10
I'm developing a web application to learn Django (python 3.4 & Django 1.6.10). The web app has complex and often updated workflows. I decided to integrate the Django-Viewflow library (https://github.com/viewflow/viewflow/) as it seems to be a very convenient way to handle workflows and not incorporate the workflow logic with the application models.
task management & automation tool
Project mention: What is a good alternative to Makefile? | reddit.com/r/AskProgramming | 2022-07-07
I found these two https://pydoit.org/ and https://snakemake.readthedocs.io/en/stable/ but I am still looking for alternatives. Is there any more?
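For readers comparing these tools with Make, a doit task file might look like the sketch below. It assumes doit's documented convention of a `dodo.py` file whose `task_*` functions return a dictionary with `actions`, `file_dep`, and `targets`. The file names and the shell command are hypothetical, chosen only for illustration.

```python
# dodo.py: a minimal sketch of a Makefile-style rule written for doit.
# The file names and the shell command below are hypothetical.


def task_count_words():
    """Count the words in input.txt and write the result to counts.txt."""
    return {
        # Files this task reads, comparable to Makefile prerequisites.
        "file_dep": ["input.txt"],
        # Files this task produces, comparable to a Makefile target.
        "targets": ["counts.txt"],
        # Shell commands to run, comparable to a Makefile recipe.
        "actions": ["wc -w input.txt > counts.txt"],
    }
```

With doit installed, running `doit` in the same directory should execute the task, and later runs should skip it while `input.txt` is unchanged.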
A powerful workflow engine implemented in pure Python
The SOC Analysts all-in-one CLI tool to automate and speed up workflow.
Project mention: A user has opened an attachment in a phishing email (MIME file, possibly .pdf). Our antivirus isn't finding anything, and there are no clear indications of compromise. We don't have a clear policy to respond to things like this. What would you do? | reddit.com/r/sysadmin | 2021-12-20
I haven't played with this yet, but it looks promising for trying to automate your OSINT when researching phishing emails: https://github.com/TheresAFewConors/Sooty
A flexible, easy to use, automation framework allowing users to integrate their capabilities and devices to cut through the repetitive, tedious tasks slowing them down. #nsacyber
Project mention: Current college student here. What is it like to work for defense contractors? | reddit.com/r/cscareerquestions | 2021-11-10
As for quirks, the biggest quirk is that you usually need to get a security clearance, and that means no drugs. As far as the tech goes, it depends on what company you're working for and what government product they produce. If it's software for an otherwise physical product like a missile or an AGV, then it's probably gonna be some old stable language like C, with something like Java being used on the server side to talk to the machine. Meanwhile, there's definitely Python work sprinkled all throughout everything, and there are certainly parts of the government working on Docker or Kubernetes stuff. Like here's a completely unclassified government project that I've contributed to. It uses Docker and YAML to automate tasks.
Data intensive science for everyone. (by galaxyproject)
A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
Project mention: Lessons Learned from Running Apache Airflow at Scale | news.ycombinator.com | 2022-05-23
Machine Learning automation and tracking
Project mention: Discussion on Need of Feature Stores | reddit.com/r/mlops | 2022-07-17
Obsei is a low code AI powered automation tool. It can be used in various business flows like social listening, AI based alerting, brand image analysis, comparative study and more.
Project mention: Resources for social listening (preferably but not limited to Spanish) | reddit.com/r/LanguageTechnology | 2022-03-12
Orchestra is a Robotic Process Automation system for orchestrating project teams of experts and machines. (by b12io)
Project mention: 3% of 666 Python codebases we checked had a silently failing unit test | reddit.com/r/Python | 2022-02-15
https://github.com/ansible-community/ara/pull/358
https://github.com/b12io/orchestra/pull/830
https://github.com/batiste/django-page-cms/pull/210
https://github.com/carpentries/amy/pull/2130
https://github.com/celery/django-celery/pull/612
https://github.com/django-cms/django-cms/pull/7241
https://github.com/django-oscar/django-oscar/pull/3867
https://github.com/esrg-knights/Squire/pull/253
https://github.com/Frojd/django-react-templatetags/pull/64
https://github.com/groveco/django-sql-explorer/pull/474
https://github.com/jazzband/django-silk/pull/550
https://github.com/keras-team/keras/pull/16073
https://github.com/ministryofjustice/cla_backend/pull/773
https://github.com/nitely/Spirit/pull/306
https://github.com/python/pythondotorg/pull/1987
https://github.com/rapidpro/rapidpro/pull/1610
https://github.com/ray-project/ray/pull/22396
https://github.com/saltstack/salt/pull/61647
https://github.com/Swiss-Polar-Institute/project-application/pull/483
https://github.com/UEWBot/dipvis/pull/216
This toolkit helps you to add simple, human-readable and platform-independent CLI, API and metadata to ad-hoc MLOps and DevOps scripts and artifacts to make them more portable, reusable, interoperable, customizable and deterministic across continuously changing software & hardware. It is used to modularize ML Systems and AI. (by mlcommons)
Parallel programming with Python
Project mention: GitHub - luispedro/jug: Parallel programming with Python | reddit.com/r/programming | 2021-12-22
Git Plan - a better workflow for git
Project mention: How to Write a Great Git Commit Message | news.ycombinator.com | 2022-04-26
I also have this problem, so I made a tool that lets you write your commit messages in advance. It helps me to focus on one problem at a time.
One feature I wanted to add was for it to parse your source code for comments with a specific format (e.g. `# git-plan feat xyz` or `# git-plan fix xyz`) and then stitch all the hunks together into commits for you. So all you'd have to do is comment your code and then run `git plan commit` and it would generate commits for you to confirm with y/n.
(I haven't worked on it for a while though)
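The comment-scanning step of that idea could be sketched as follows. This is not git-plan's implementation (the feature is described above as unbuilt); it only assumes the `# git-plan <type> <name>` marker format quoted in the comment. The function name `group_marked_lines` is hypothetical, and stitching hunks into commits is left out entirely.

```python
import re
from collections import defaultdict

# Marker format assumed from the comment above: "# git-plan <type> <name>".
MARKER = re.compile(r"#\s*git-plan\s+(?P<kind>\w+)\s+(?P<name>\S+)")


def group_marked_lines(source: str) -> dict:
    """Map each (type, name) marker to the 1-based line numbers that carry it."""
    groups = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            groups[(match["kind"], match["name"])].append(lineno)
    return dict(groups)


if __name__ == "__main__":
    example = (
        "total = 0  # git-plan feat xyz\n"
        "print(total)\n"
        "total += 1  # git-plan fix abc\n"
    )
    print(group_marked_lines(example))
```

Each group of line numbers would then still need to be matched against the hunks reported by git before any commit could be proposed.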
A terminal UI dashboard to monitor requests for code review across Github and Gitlab repositories.
Project mention: I created a CLI tool to show you a dashboard of Pull Requests you care about | reddit.com/r/github | 2021-11-23
This is awesome; I built something to solve the same problem recently in Python (https://github.com/apoclyps/reviews). I was debating rewriting it in Go with Bubble Tea, so it's awesome to discover this project!
Python framework for Cadence Workflow Service
Project mention: Is it ok to merge few applications into one ? How to do it ? | reddit.com/r/linux4noobs | 2021-10-11
For example, you could use pulsectl and cadence-python for bindings.
Python Workflow related posts
Analyze and plot 5.5M records in 20s with BigQuery and Ploomber
2 projects | dev.to | 8 Aug 2022
Show HN: Debuglater – Serialize Python traceback for later debugging
3 projects | news.ycombinator.com | 8 Aug 2022
debuglater: Serialize Python traceback for later debugging
2 projects | reddit.com/r/Python | 8 Aug 2022
I've been really frustrated with picking the right tools for bulk RNA-seq, so I did a long literature review and wrote this workflow
3 projects | reddit.com/r/bioinformatics | 4 Aug 2022
6 projects | news.ycombinator.com | 2 Aug 2022
4 projects | reddit.com/r/dataengineering | 2 Aug 2022
Tips and Tricks to Use Jupyter Notebooks Effectively
3 projects | dev.to | 1 Aug 2022
What are some of the best open-source Workflow projects in Python? This list will help you: