Top 23 Python Workflow Projects
-
So apache-airflow-backport-providers-amazon does support EC2, but only for starting an instance with EC2StartInstanceOperator and stopping it with EC2StopInstanceOperator, provided the instance_id is known. It is missing create and terminate functionality.
-
-
There are specialized tools like DataHub (see this for column-level reporting: https://feature-requests.datahubproject.io/roadmap/541 ) that would help. But really, in a good data platform, the orchestration layer should be aggregating metadata and giving you everything you need to trace lineage. A tool like Dagster does this well if you make full use of the Software Defined Assets capability, but that is fairly new, so not many people have embraced it yet.
-
Project mention: [D] Kubernetes for ML - how are y'all doing it? | reddit.com/r/MachineLearning | 2022-04-14
We use Polyaxon and it’s pretty good
-
Prerequisite XKCD, and then pdm
-
Project mention: Analyze and plot 5.5M records in 20s with BigQuery and Ploomber | dev.to | 2022-08-08
Since our analysis comprises SQL and Python, we use Ploomber, an open-source framework to write maintainable pipelines. It abstracts all the details, so we focus on writing the SQL and Python scripts.
-
flyte
Kubernetes-native workflow automation platform for complex, mission-critical data and ML processes at scale. It has been battle-tested at Lyft, Spotify, Freenome, and others and is truly open-source.
Some of these were the core problems that we wanted to address as part of https://flyte.org. We started with a team-first, multi-tenant approach at the core. For example, each team can have separate IAM roles, secrets are restricted to teams, tasks and workflows are shareable across teams without making libraries, and it is possible to trigger workflows across teams.
-
We use https://github.com/octodns/octodns for some of our DNS records. It's flexible, much faster than Terraform for thousands of records, and the maintainer Ross has been responsive on issues and pull requests. Also see Cloudflare's blog for how they use it.
-
Project mention: How to create a django ViewFlow process programmatically | reddit.com/r/codehunter | 2022-04-10
I'm developing a web application to learn Django (python 3.4 & Django 1.6.10). The web app has complex and often updated workflows. I decided to integrate the Django-Viewflow library (https://github.com/viewflow/viewflow/) as it seems to be a very convenient way to handle workflows and not incorporate the workflow logic with the application models.
-
I found these two, https://pydoit.org/ and https://snakemake.readthedocs.io/en/stable/, but I am still looking for alternatives. Are there any more?
-
-
Project mention: A user has opened an attachment in a phishing email (MIME file, possibly .pdf). Our antivirus isn't finding anything, and there are no clear indications of compromise. We don't have a clear policy to respond to things like this. What would you do? | reddit.com/r/sysadmin | 2021-12-20
I haven't played with this yet, but it looks promising for trying to automate your OSINT when researching phishing emails: https://github.com/TheresAFewConors/Sooty
-
WALKOFF
A flexible, easy to use, automation framework allowing users to integrate their capabilities and devices to cut through the repetitive, tedious tasks slowing them down. #nsacyber
Project mention: Current college student here. What is it like to work for defense contractors? | reddit.com/r/cscareerquestions | 2021-11-10
As for quirks, the biggest quirk is that you usually need to get a security clearance, and that means no drugs. As far as the tech goes, depends on what company you're working for and what government product they produce. If it's software for an otherwise physical product like a missile or an AGV, then it's probably gonna be some old stable language like C, with something like Java being used on the server side to talk to the machine. Meanwhile, there's definitely Python work sprinkled all throughout everything, and there's certainly parts of the government working on Docker or Kubernetes stuff. Like here's a completely unclassified government project that I've contributed to. It uses Docker and Yaml to automate tasks.
-
-
toil
A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
Project mention: Lessons Learned from Running Apache Airflow at Scale | news.ycombinator.com | 2022-05-23
-
obsei
Obsei is a low-code, AI-powered automation tool. It can be used in various business flows like social listening, AI-based alerting, brand image analysis, comparative study and more.
Project mention: Resources for social listening (preferably but not limited to Spanish) | reddit.com/r/LanguageTechnology | 2022-03-12
-
orchestra
Orchestra is a Robotic Process Automation system for orchestrating project teams of experts and machines. (by b12io)
Project mention: 3% of 666 Python codebases we checked had a silently failing unit test | reddit.com/r/Python | 2022-02-15
https://github.com/ansible-community/ara/pull/358
https://github.com/b12io/orchestra/pull/830
https://github.com/batiste/django-page-cms/pull/210
https://github.com/carpentries/amy/pull/2130
https://github.com/celery/django-celery/pull/612
https://github.com/django-cms/django-cms/pull/7241
https://github.com/django-oscar/django-oscar/pull/3867
https://github.com/esrg-knights/Squire/pull/253
https://github.com/Frojd/django-react-templatetags/pull/64
https://github.com/groveco/django-sql-explorer/pull/474
https://github.com/jazzband/django-silk/pull/550
https://github.com/keras-team/keras/pull/16073
https://github.com/ministryofjustice/cla_backend/pull/773
https://github.com/nitely/Spirit/pull/306
https://github.com/python/pythondotorg/pull/1987
https://github.com/rapidpro/rapidpro/pull/1610
https://github.com/ray-project/ray/pull/22396
https://github.com/saltstack/salt/pull/61647
https://github.com/Swiss-Polar-Institute/project-application/pull/483
https://github.com/UEWBot/dipvis/pull/216
-
ck
This toolkit helps you to add simple, human-readable and platform-independent CLI, API and metadata to ad-hoc MLOps and DevOps scripts and artifacts to make them more portable, reusable, interoperable, customizable and deterministic across continuously changing software & hardware. It is used to modularize ML Systems and AI. (by mlcommons)
-
Project mention: GitHub - luispedro/jug: Parallel programming with Python | reddit.com/r/programming | 2021-12-22
-
I also have this problem, so I made a tool that lets you write your commit messages in advance. It helps me to focus on one problem at a time.
One feature I wanted to add was for it to parse your source code for comments with a specific format (e.g. `# git-plan feat xyz` or `# git-plan fix xyz`) and then stitch all the hunks together into commits for you. So all you'd have to do is comment your code and then run `git plan commit` and it would generate commits for you to confirm with y/n.
https://github.com/synek/git-plan
(I haven't worked on it for a while though)
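The comment-parsing idea described in that post can be sketched in a few lines of Python. This is only an illustration of the proposed feature, not git-plan's actual code: the marker format comes from the examples in the post, while the function names `collect_planned_commits` and `format_commit_message` are hypothetical. It only finds and groups the marker comments; stitching the surrounding hunks into real commits is left out.

```python
import re
from typing import Dict, List, Tuple

# Marker format taken from the post: "# git-plan <type> <summary>".
# The regex itself is an assumption about how such markers could be matched.
MARKER = re.compile(r"#\s*git-plan\s+(?P<kind>\w+)\s+(?P<summary>.+?)\s*$")


def collect_planned_commits(source: str) -> Dict[Tuple[str, str], List[int]]:
    """Group the 1-based line numbers of marker comments by (type, summary)."""
    planned: Dict[Tuple[str, str], List[int]] = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            key = (match["kind"], match["summary"])
            planned.setdefault(key, []).append(lineno)
    return planned


def format_commit_message(kind: str, summary: str) -> str:
    """Build a one-line commit message such as 'feat: xyz' (hypothetical format)."""
    return f"{kind}: {summary}"


if __name__ == "__main__":
    example = (
        "total = a + b  # git-plan feat xyz\n"
        "print(total)\n"
        "check(total)  # git-plan fix xyz\n"
    )
    for (kind, summary), lines in collect_planned_commits(example).items():
        print(format_commit_message(kind, summary), "-> lines", lines)
```

Lines that share the same type and summary end up in the same group, which is roughly what a tool would need before asking the user to confirm each generated commit with y/n.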
-
reviews
A terminal UI dashboard to monitor requests for code review across GitHub and GitLab repositories.
Project mention: I created a CLI tool to show you a dashboard of Pull Requests you care about | reddit.com/r/github | 2021-11-23
This is awesome; I built something to solve the same problem recently in Python (https://github.com/apoclyps/reviews); I was debating rewriting it in Go with Bubble Tea, so it's awesome to discover this project!
-
Project mention: Is it ok to merge few applications into one ? How to do it ? | reddit.com/r/linux4noobs | 2021-10-11
For example you could use pulsectl and cadence-python for bindings
Python Workflow related posts
- Analyze and plot 5.5M records in 20s with BigQuery and Ploomber
- Show HN: Debuglater – Serialize Python traceback for later debugging
- debuglater: Serialize Python traceback for later debugging
- I've been really frustrated with picking the right tools for bulk RNA-seq, so I did a long literature review and wrote this workflow
- Airflow's Problem
- Field Lineage
- Tips and Tricks to Use Jupyter Notebooks Effectively
Index
What are some of the best open-source Workflow projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | Airflow | 26,899 |
2 | Prefect | 9,712 |
3 | dagster | 5,110 |
4 | polyaxon | 3,134 |
5 | PDM | 2,902 |
6 | ploomber | 2,599 |
7 | flyte | 2,545 |
8 | octoDNS | 2,401 |
9 | viewflow | 2,227 |
10 | doit | 1,423 |
11 | Spiff | 1,227 |
12 | Sooty | 1,092 |
13 | WALKOFF | 1,071 |
14 | galaxy | 984 |
15 | toil | 813 |
16 | mlrun | 769 |
17 | obsei | 699 |
18 | orchestra | 645 |
19 | ck | 473 |
20 | jug | 381 |
21 | git-plan | 170 |
22 | reviews | 148 |
23 | cadence-python | 137 |