First True Exascale Supercomputer

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • ompi

    Open MPI main development repository

  • I have a bit of experience programming for a highly-parallel supercomputer, specifically in my case an IBM BlueGene/Q. In that case, the answer is a lot of message passing (we used Open MPI [0]). Since the nodes are discrete and don't have any shared memory, you end up with something kinda reminiscent of the actor model as popularized by Erlang and co -- but in C for number-crunching performance.

    That said, each of the nodes is itself composed of multiple cores with shared memory. So in cases where you really want to grind out performance, you actually end up using message passing to divvy up chunks of work, and then use classic pthreads to parallelize things further, with lower latency.

    Debugging is a bit of a nightmare, though, since some bugs inevitably only come up once you have a large number of nodes running the algorithm in parallel. But you'll probably be in a mainframe-style time-sharing setup, so you may have to wait hours or more to rerun things.

    This applies less to some of the newer supercomputers, which are more or less clusters of GPUs instead of clusters of CPUs. I imagine there's some commonality, but I haven't worked with any of them so I can't really say.

    [0] https://www.open-mpi.org/

  • AMDGPU.jl

    AMD GPU (ROCm) programming in Julia

  • This is exciting news! What's also exciting is that it's not just C++ that can run on this supercomputer; there is also good (currently unofficial) support for programming those GPUs from Julia, via the AMDGPU.jl library (note: I am the author/maintainer of this library). Some of our users have been able to run AMDGPU.jl's testsuite on the Crusher test system (which is an attached testing system with the same hardware configuration as Frontier), as well as their own domain-specific programs that use AMDGPU.jl.

    What's nice about programming GPUs in Julia is that you can write code once and execute it on multiple kinds of GPUs, with excellent performance. The KernelAbstractions.jl library makes this possible for compute kernels by acting as a frontend to AMDGPU.jl, CUDA.jl, and soon Metal.jl and oneAPI.jl, allowing a single piece of code to be portable to AMD, NVIDIA, Intel, and Apple GPUs, and also CPUs. Similarly, the GPUArrays.jl library allows the same behavior for idiomatic array operations, and will automatically dispatch calls to BLAS, FFT, RNG, linear solver, and DNN vendor-provided libraries when appropriate.

    I'm personally looking forward to helping researchers get their Julia code up and running on Frontier so that we can push scientific computing to the max!

    Library link: <https://github.com/JuliaGPU/AMDGPU.jl>

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts