Metric | huey | nifi
---|---|---
Mentions | 10 | 35
Stars | 4,870 | 4,369
Growth | - | 2.9%
Activity | 6.6 | 9.9
Last commit | 14 days ago | 4 days ago
Language | Python | Java
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
huey
- Nextflow: Data-Driven Computational Pipelines
I've considered using Nextflow for bioinformatics pipelines but have yet to take the plunge. At work, I develop a proteomics pipeline that is composed of huey tasks (Python library; simple alternative to Celery) which either use subprocess to call out to some external tool, or are just pure Python. It runs in a worker container which is created by Docker Swarm, and all containers pull jobs from Redis. For our scale, it works great. However, I don't have control over the resource utilization of individual steps, and in the past I've had issues with the pipeline blocking as a result of how I was chaining tasks together. I think something like Nextflow would remove these limitations, but one thing I think I would miss is the ability to debug individual pipeline steps locally with an interactive debugger. As far as I can tell, Nextflow has logging/tracing facilities but nothing quite like an interactive debugger. I'd be happy to be told I'm wrong, or even that I'm doing it wrong.
- Background jobs with Django
Other options are DjangoQ and Huey, which tend to work ok. Of the two I prefer DjangoQ. Database backed, don't require the Redis/Celery rigmarole.
- What's the best thing you've learned about Django this year?
Funny, just this moment I finally switched from Celery to huey, and so far I don't regret it. huey looks very promising, has good documentation, and is well integrated into Django. You should give it a try: https://github.com/coleifer/huey
- This Week in Python
huey – a little task queue for python
- What is your favourite task queuing framework?
Huey -> Same again?
- 5 background scheduling libraries in Python you must know
Huey: https://github.com/coleifer/huey
- Celery in production: Three more years of fixing bugs
- Not sure if I should use celery or asyncio
I just want to add that a couple of Celery alternatives worth looking at are huey and dramatiq.
- What is the best option for a (Python 3) task queue on Windows now that Celery 4 has dropped Windows support?
huey
- Django 4.0 released
Same, I ran into an issue because of django-background-tasks. I am thinking of replacing it with huey.
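The Nextflow comment above describes pipeline steps that either shell out to an external tool with subprocess or are pure Python, and values being able to debug a single step locally. A minimal sketch of that pattern follows, using only the standard library. The function names (`run_tool`, `count_sequences`, `summarize`) and the FASTA-counting "tool" are hypothetical, and the Python interpreter stands in for the external program so the sketch runs anywhere; it is not the commenter's actual pipeline.

```python
import subprocess
import sys


def run_tool(args, stdin_text=""):
    """Run an external command and return its stdout as text."""
    completed = subprocess.run(
        args,
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError instead of ignoring a failed step
    )
    return completed.stdout


def count_sequences(fasta_text):
    """Hypothetical step: shell out to a 'tool' that counts FASTA header lines.

    Assumption: the Python interpreter plays the role of the external tool.
    """
    script = "import sys; print(sum(line.startswith('>') for line in sys.stdin))"
    return int(run_tool([sys.executable, "-c", script], stdin_text=fasta_text))


def summarize(count):
    """Hypothetical pure-Python step that consumes the previous step's result."""
    return f"{count} sequences"


if __name__ == "__main__":
    sample = ">sp|P1\nMKT\n>sp|P2\nGAV\n"
    # Each step is an ordinary function, so it can be called directly
    # (or stepped through with pdb) without a worker, Redis, or a container.
    print(summarize(count_sequences(sample)))
```

Because the steps are plain functions, registering them with a task queue is a separate concern. With huey this would presumably mean decorating them with `@huey.task()`; huey's documentation also describes an immediate mode and a `call_local()` method for running a task's underlying function synchronously, which may help with the local-debugging workflow the comment mentions.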
nifi
- FLaNK Stack Weekly 19 Feb 2024
- Ask HN: What are some unpopular technologies you wish people knew more about?
- FLaNK Stack Weekly for 13 November 2023
- Ask HN: What low code platforms are worth using?
Apache NiFi (https://nifi.apache.org/).
It uses the concept of flow-based programming. It is underacknowledged, but this tool is very flexible. I have used it as an event bus for all the third-party integrations.
- Apache Nifi: easy to use, powerful, reliable system to process, distribute data
- Tool decision - What architecture would you choose and why?
- Help with choosing techstack for a new DE team
Presently setting up Apache NiFi + Apache MiNiFi for the ETL portion of my work. NiFi was easy enough to figure out, but the docs for MiNiFi have been a pain due to differences between the Java and C++ versions. I then configured it entirely with the Java version so that it was easier to search for answers about the MiNiFi YAML syntax.
- MS SQL Change Data Capture
Found it
- Is there something like airflow but written in Scala/Java?
Apache Camel, Apache NiFi, Spring Cloud
- Json splitting and Rerouting (new to nifi)
NiFi, like most Apache projects, does most of its discussion on its mailing lists, but also has a Slack.
What are some alternatives?
celery - Distributed Task Queue (development branch)
Logstash - Logstash - transport and process your logs, events, or other data
rq - Simple job queues for Python
superset - Apache Superset is a Data Visualization and Data Exploration Platform
dramatiq - A fast and reliable background task processing library for Python 3.
RabbitMQ - Open source RabbitMQ: core server and tier 1 (built-in) plugins
meltano - Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
mrq - Mr. Queue - A distributed worker task queue in Python using Redis & gevent
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
django-background-tasks - A database-backed work queue for Django
Metabase - The simplest, fastest way to get business intelligence and analytics to everyone in your company