armada
ohpc
Metric | armada | ohpc
---|---|---
Mentions | 8 | 28
Stars | 414 | 821
Growth | 5.1% | 1.8%
Activity | 9.7 | 9.5
Last commit | 5 days ago | 4 days ago
Language | Go | C
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
armada
- job scheduling for scientific computing on k8s?
  Armada could be an alternative: https://armadaproject.io/
- OpenAI, Scaling Kubernetes to 7,500 nodes
  To overcome the limitations on cluster size in Kubernetes, folks may want to look at the Armada Project (https://armadaproject.io/). Armada is a
- Kubernetes was never designed for batch jobs
- Kubernetes Was Never Designed for Batch Jobs
  Another aspect of batch jobs is that we’ll often want to run distributed computations where we split our data into chunks and run a function on each chunk. One popular option is to run Spark, which is built for exactly this use case, on top of Kubernetes. And there are other options for additional software to make running distributed computations on Kubernetes easier.
- Armada
- Karmada: Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
  The naming sounds very similar to this project: https://github.com/G-Research/armada
- Queue batch job that would exceed namespace quota.
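
The chunk-and-map pattern described in the "Kubernetes Was Never Designed for Batch Jobs" excerpt above (split the data into chunks, run a function on each chunk) can be sketched on a single machine in plain Python. This is only an illustration of the idea, not how Armada or Spark implement it; `split_into_chunks`, `process_chunk`, and `run_batch` are hypothetical names, and the per-chunk function (a simple sum) is an assumption chosen to keep the example small.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Hypothetical per-chunk function: here it just sums the numbers."""
    return sum(chunk)


def run_batch(items, chunk_size, max_workers=4):
    """Run process_chunk on every chunk concurrently; results keep chunk order."""
    chunks = split_into_chunks(items, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    data = list(range(10))
    partial_sums = run_batch(data, chunk_size=4)
    # Combine the per-chunk results into a final answer.
    print(partial_sums, sum(partial_sums))
```

On a cluster, each chunk would typically become its own job or task handed to a batch scheduler rather than a thread in one process; the split, map, and combine steps stay the same.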
ohpc
- interesting read
- Rocky strikes back at Red Hat
  We have plenty of licensed RHEL, but in isolated environments the hurdle of connecting to a Satellite server or their subscription hub on the internet is too high -- at least with Rocky and its ilk available. For this setup, the licensing model doesn't match reality, at least not easily.
  Are we really going to build out compatible configuration management, monitoring, logging, etc.? -- it's not a seamless transition. How much time do we have to put towards this?
  And yes -- there are software compatibility issues. Look at the OpenHPC software distribution; it's designed for SUSE or Enterprise Linux: https://github.com/openhpc/ohpc/wiki/2.X
- job scheduling for scientific computing on k8s?
  I recommend you just stick with HPC-centric tools and workflows. Your scientists aren’t going to learn k8s, as you said. SLURM is the scheduler you want, and if you’re new to HPC, I recommend taking a look at https://openhpc.community
- HPC usage etiquette.
  The general consensus is that pam_slurm_adopt is the better module (that's just one dude's opinion, but his citations are good). The advantage is that it not only gatekeeps SSH access but also drops the user's SSH session into the cgroups that constrain their resource limits, which also means their CPU usage will show up in sacct for the job. (If the user has multiple jobs running on a node, their SSH session may get dropped into the wrong one; no help for that.)
- HPC OS for Non-expert
- How useful/important is OpenStack for HPC?
- Wanting to setup a cluster
- Essential skills for new HPC Admin?
  Check this: https://openhpc.community/ (this helped me a lot when I started; I'm no longer the admin of such systems).
- Looking to optimize research lab resources...
  Overall, if you're already in a Red Hat-based environment, an installation of OpenHPC is pretty straightforward. Their reference implementation assumes you have a head node for the scheduler that all other nodes NAT through, but that's a common setup rather than a hard requirement. It also assumes you can reformat the compute nodes and dedicate them to HPC work, so if you need to keep the systems available as normal workstations, you'll need to deviate a bit. You could also use the OpenHPC instructions as a guide for what packages to install, but it may take longer to get everything right.
- xcat education ?
  https://github.com/openhpc/ohpc/wiki/1.3.X Newer versions of OpenHPC don't seem to be releasing xCAT guides anymore, unfortunately.
What are some alternatives?
karmada - Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
spack - A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
kube-batch - A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC
slurm - Slurm: A Highly Scalable Workload Manager
madaidans-insecurities.github.io
EasyBuild - EasyBuild - building software with ease
argo - Workflow Engine for Kubernetes
openpbs - An HPC workload manager and job scheduler for desktops, clusters, and clouds.
kueue - Kubernetes-native Job Queueing
deepops - Tools for building GPU clusters
magic-wormhole - get things from one computer to another, safely [Moved to: https://github.com/magic-wormhole/magic-wormhole]
infrastructure - The infrastructure monorepo for the Rocky Linux project. This project will be archived/deprecated in the future.