Top 12 slurm Open-Source Projects
-
toil
A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
-
slurm-mail
Slurm-Mail is a drop in replacement for Slurm's e-mails to give users much more information about their jobs compared to the standard Slurm e-mails.
-
soopervisor
☁️ Export Ploomber pipelines to Kubernetes (Argo), Airflow, AWS Batch, SLURM, and Kubeflow.
-
slurm-monitoring-public
Monitor your high performance infrastructure configured over slurm using TIG stack
> It's been a while since you can rerun/resume Nextflow pipelines
Yes, you can resume, but you need your whole upstream DAG to be present. Snakemake can rerun a job when only the dependencies of that job are present, which makes it possible to neatly manage disk usage, or to archive an intermediate state of a project and rerun things from there.
> and yes, you can have dry runs in Nextflow
You have stubs, which really isn't the same thing.
> I have no idea what you're referring to with the 'arbitrary limit of 1000 parallel jobs' though
I was referring to this issue: https://github.com/nextflow-io/nextflow/issues/1871. The discussion there doesn't do the issue full justice, though. Nextflow spawns each job in a separate thread, and when it tries to spawn 1000+ Condor jobs it dies with a cryptic error message. Setting -Dnxf.pool.type=sync and -Dnxf.pool.maxThreads=N breaks the ability to resume: Nextflow attempts to rerun the pipeline instead.
> As for deleting temporary files, there are features that allow you to do a few things related to that, and other features being implemented.
There are some hacks for this - but nothing I would feel safe to integrate into a production tool. They are implementing something - you're right - and it's been the case for several years now, so we'll see.
Snakemake has all that out of the box.
I'm trying to have Slurm automatically switch jobs to a specific partition via the job_submit.lua plugin whenever our users request strictly more than 8 CPUs. But extracting or calculating ahead of time how many CPUs will be allocated or requested isn't trivial (to me). Are there attributes in job_submit that could help with this task? For example, I don't see any job->desc.ntasks attribute in https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/lua/job_submit_lua.c. Any information or documentation on how to leverage job_submit.lua would be appreciated.
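The decision logic described in the question can be sketched independently of the plugin API. The following is a minimal Python sketch of that logic only, not a job_submit.lua implementation: the parameter names `num_tasks`, `cpus_per_task`, and `min_cpus` are assumptions modeled loosely on a job descriptor, the partition name `bigjobs` is hypothetical, and unset values are represented here as `None`. Which fields the Lua plugin actually exposes, and how it represents unset values, would need to be checked against job_submit_lua.c. The estimate also ignores per-node options, so it may undercount some requests.

```python
from typing import Optional

CPU_THRESHOLD = 8          # from the question: switch when strictly more than 8 CPUs
BIG_PARTITION = "bigjobs"  # hypothetical partition name, for illustration only


def estimate_cpus(num_tasks: Optional[int],
                  cpus_per_task: Optional[int],
                  min_cpus: Optional[int]) -> int:
    """Estimate the CPUs a job asks for from (assumed) descriptor fields.

    Unset fields are passed as None and default to 1 task and 1 CPU per task.
    """
    tasks = num_tasks if num_tasks else 1
    per_task = cpus_per_task if cpus_per_task else 1
    estimate = tasks * per_task
    # An explicit minimum CPU count, if present, acts as a lower bound.
    if min_cpus:
        estimate = max(estimate, min_cpus)
    return estimate


def choose_partition(requested_partition: str,
                     num_tasks: Optional[int] = None,
                     cpus_per_task: Optional[int] = None,
                     min_cpus: Optional[int] = None) -> str:
    """Return the partition the job should run in under the >8 CPU rule."""
    if estimate_cpus(num_tasks, cpus_per_task, min_cpus) > CPU_THRESHOLD:
        return BIG_PARTITION
    return requested_partition


if __name__ == "__main__":
    # 4 tasks x 2 CPUs = 8, which is not strictly more than 8.
    print(choose_partition("default", num_tasks=4, cpus_per_task=2))
    # 3 tasks x 3 CPUs = 9, which exceeds the threshold.
    print(choose_partition("default", num_tasks=3, cpus_per_task=3))
```

Porting this to the plugin would mean reading the equivalent fields from the job descriptor inside the submit hook and overwriting the partition there; that part is left out because the exact field names are the open question.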
Project mention: Show HN: Hatchet – Open-source distributed task queue | news.ycombinator.com | 2024-03-08
A little late now, but I wonder if https://github.com/DataBiosphere/toil might meet your requirements.
Project mention: Sequential and parallel execution of long-running shell commands | news.ycombinator.com | 2024-03-20
A similar tool I highly recommend: https://github.com/justanhduc/task-spooler
At first I thought it would just be a one-off tool for one of my projects, but I later discovered that it has everything I need, and it has been my daily driver ever since.
Since I am unsure how your HPC admin implemented email sending, you might want to talk to them and see if they could implement something like slurm-mail. It has been working very well on the cluster I manage so far and is very customisable based on users' needs.
slurm related posts
-
Sequential and parallel execution of long-running shell commands
-
Accelerators
-
The AI Battlefield Engineering – What You Need to Know
-
GNU Parallel, where have you been all my life?
-
What are some lesser-known Linux software that are absolutely life changing?
Index
What are some of the best open-source slurm projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | ml-engineering | 9,995 |
2 | nextflow | 2,568 |
3 | slurm | 2,377 |
4 | toil | 874 |
5 | slurm-docker-cluster | 263 |
6 | task-spooler | 235 |
7 | smk-simple-slurm | 119 |
8 | slurm-mail | 84 |
9 | stui | 71 |
10 | turm | 53 |
11 | soopervisor | 43 |
12 | slurm-monitoring-public | 17 |