redun
huey
Metric | redun | huey
---|---|---
Mentions | 4 | 10
Stars | 486 | 4,890
Growth | 2.1% | -
Activity | 7.5 | 6.6
Latest commit | about 2 months ago | 26 days ago
Language | Python | Python
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
redun
- Redun: Yet another redundant workflow engine
-
Nextflow: Data-Driven Computational Pipelines
I'm personally a huge fan of redun¹ for running computational pipelines. It's pure Python, it's easy to learn and debug, it has automatic caching, retries, and provenance logging, and it integrates well with AWS Batch for running large jobs. I've been really impressed with how easy it is to run a job to completion that fans out to thousands of AWS spot instances at once.
I've used Nextflow in the past, and I found it much harder to use. Learning another DSL is annoying, the documentation was sparse, I constantly ran into bugs, and it was hard to debug in general. I don't know how much it has changed over the past 3 years, though.
¹ https://github.com/insitro/redun
- Insitro's redun: Yet another redundant workflow engine
- Insitro's new open source software uses DAGs.
huey
-
Nextflow: Data-Driven Computational Pipelines
I've considered using Nextflow for bioinformatics pipelines but have yet to take the plunge. At work, I develop a proteomics pipeline composed of huey¹ tasks (huey is a Python library, a simple alternative to Celery). Each task either uses subprocess to call out to some external tool or is pure Python. The pipeline runs in worker containers created by Docker Swarm, and all containers pull jobs from Redis. For our scale, it works great. However, I don't have control over the resource utilization of individual steps, and in the past I've had issues with the pipeline blocking as a result of how I was chaining tasks together. I think something like Nextflow would remove these limitations, but one thing I think I would miss is the ability to debug individual pipeline steps locally with an interactive debugger. As far as I can tell, Nextflow has logging/tracing facilities but nothing quite like an interactive debugger. I'd be happy to be told I'm wrong, or even that I'm doing it wrong.
____
¹ https://github.com/coleifer/huey/
-
Background jobs with Django
Other options are DjangoQ and Huey, which tend to work OK. Of the two I prefer DjangoQ. Database backed, don't require the Redis/Celery rigmarole.
-
What's the best thing you've learned about Django this year?
Funny, just this moment I finally switched from Celery to huey, and so far I don't regret it. huey looks very promising, has good documentation, and is well integrated into Django. You should give it a try: https://github.com/coleifer/huey
-
This Week in Python
huey – a little task queue for python
-
What is your favourite task queuing framework?
Huey -> Same again?
-
5 background scheduling libraries in Python you must know
Huey: https://github.com/coleifer/huey
- Celery in production: Three more years of fixing bugs
-
Not sure if I should use celery or asyncio
I just want to add that a couple of Celery alternatives worth looking at are huey and dramatiq.
-
What is the best option for a (Python 3) task queue on Windows now that Celery 4 has dropped Windows support?
huey
-
Django 4.0 released
Same, I ran into an issue because of django-background-tasks. I am thinking of replacing it with huey.
What are some alternatives?
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
celery - Distributed Task Queue (development branch)
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
rq - Simple job queues for Python
nextflow - A DSL for data-driven computational pipelines
dramatiq - A fast and reliable background task processing library for Python 3.
luigi - Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
RabbitMQ - Open source RabbitMQ: core server and tier 1 (built-in) plugins
cgpipe - cgpipe - minimum viable HPC pipeline
mrq - Mr. Queue - A distributed worker task queue in Python using Redis & gevent
common-workflow-language - Repository for the CWL standards. Use https://cwl.discourse.group/ for support 😊
KQ - Kafka-based Job Queue for Python