cunumeric VS ompi

Compare cunumeric vs ompi and see how they differ.

cunumeric

An Aspiring Drop-In Replacement for NumPy at Scale (by nv-legate)

ompi

Open MPI main development repository (by open-mpi)
               cunumeric             ompi
Mentions       9                     10
Stars          595                   2,016
Growth         0.0%                  1.1%
Activity       8.5                   9.7
Last commit    1 day ago             5 days ago
Language       Python                C
License        Apache License 2.0    GNU General Public License v3.0 or later
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

cunumeric

Posts with mentions or reviews of cunumeric. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-10-09.
  • Announcing Chapel 1.32
    6 projects | news.ycombinator.com | 9 Oct 2023
  • Is Parallel Programming Hard, and, If So, What Can You Do About It? [pdf]
    4 projects | news.ycombinator.com | 19 Feb 2023
    I am biased because this is my research area, but I have to respectfully disagree. Actor models are awful, and the only reason it's not obvious is because everything else is even more awful.

    But if you look at e.g., the recent work on task-based models, you'll see that you can have literally sequential programs that parallelize automatically. No message passing, no synchronization, no data races, no deadlocks. Read your programs as if they're sequential, and you immediately understand their semantics. Some of these systems are able to scale to thousands of nodes.

    An interesting example of this is cuNumeric, which allows you to take sequential Python programs that use NumPy, and by changing one line (the import statement), run automatically on clusters of GPUs. It is 100% pure awesomeness.

    https://github.com/nv-legate/cunumeric

    (I don't work on cuNumeric, but I do work on the runtime framework that cuNumeric uses.)
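
    To make the post's one-line change concrete, here is a minimal sketch modelled on cuNumeric's own stencil-style examples; the grid size and iteration count are arbitrary placeholders, and the only edit relative to plain NumPy is the import line.

        # import numpy as np        # original, single-process NumPy
        import cunumeric as np      # drop-in replacement; the Legate runtime parallelizes it

        # A stencil-style update written as ordinary sequential NumPy code.
        grid = np.zeros((1000, 1000))
        grid[0, :] = 1.0            # boundary condition

        for _ in range(100):
            # Plain slicing arithmetic: no chunking, message passing, or
            # synchronization appears anywhere in the program text.
            grid[1:-1, 1:-1] = 0.25 * (
                grid[:-2, 1:-1] + grid[2:, 1:-1] + grid[1:-1, :-2] + grid[1:-1, 2:]
            )

        print(float(grid.sum()))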

  • GPT in 60 Lines of NumPy
    9 projects | news.ycombinator.com | 9 Feb 2023
    I know this probably isn't intended for performance, but it would be fun to run this in cuNumeric [1] and see how it scales.

    [1]: https://github.com/nv-legate/cunumeric

  • Dask – a flexible library for parallel computing in Python
    8 projects | news.ycombinator.com | 17 Nov 2021
    If you want built-in GPU support (and distributed), you should check out cuNumeric (released by NVIDIA in the last week or so). Also avoids needing to manually specify chunk sizes, like it says in a sibling comment.

    https://github.com/nv-legate/cunumeric
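
    As a rough illustration of the chunk-size point, here is a hedged side-by-side sketch; the array shapes are placeholders, and it assumes cuNumeric covers numpy.random.random, which the "drop-in" claim implies.

        # dask.array: the programmer picks chunk sizes explicitly.
        import dask.array as da
        x = da.random.random((20000, 20000), chunks=(2000, 2000))
        print(x.mean().compute())

        # cuNumeric: the same computation written as plain NumPy; the runtime
        # decides how to partition the array across GPUs and nodes.
        import cunumeric as np
        y = np.random.random((20000, 20000))
        print(y.mean())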

  • Julia is the better language for extending Python
    13 projects | news.ycombinator.com | 19 Apr 2021
    Try dask

    Distribute your data and run everything as dask.delayed and then compute only at the end.

    Also check out legate.numpy from Nvidia, which promises to be a drop-in NumPy replacement that will use all your CPU cores without any tweaks on your part.

    https://github.com/nv-legate/legate.numpy
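
    A minimal sketch of the dask.delayed pattern the comment describes: build the task graph lazily and call compute() only at the end. The functions and data below are illustrative placeholders, not from the post.

        import dask

        @dask.delayed
        def preprocess(chunk):
            return sum(chunk)              # placeholder per-chunk work

        @dask.delayed
        def combine(partials):
            return sum(partials)

        chunks = [list(range(i, i + 1000)) for i in range(0, 10_000, 1000)]
        partials = [preprocess(c) for c in chunks]   # nothing runs yet
        total = combine(partials)

        # Only now does dask schedule the whole graph in parallel.
        print(total.compute())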

  • Learning more about HPC as a python guy
    1 project | /r/HPC | 19 Apr 2021
    Something for the HPC tools category: https://github.com/nv-legate/legate.numpy
  • Unifying the CUDA Python Ecosystem
    13 projects | news.ycombinator.com | 16 Apr 2021
    You might be interested in Legate [1]. It supports the NumPy interface as a drop-in replacement, supports GPUs and also distributed machines. And you can see for yourself their performance results; they're not far off from hand-tuned MPI.

    [1]: https://github.com/nv-legate/legate.numpy

    Disclaimer: I work on the library Legate uses for distributed computing, but otherwise have no connection.

  • Legate NumPy: An Aspiring Drop-In Replacement for NumPy at Scale
    1 project | news.ycombinator.com | 13 Apr 2021

ompi

Posts with mentions or reviews of ompi. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-09.
  • Ask HN: Does anyone care about OpenPOWER?
    2 projects | news.ycombinator.com | 9 Feb 2024
    The commercial Linux world (see https://github.com/open-mpi/ompi/issues/4349) and other open source OSes (eg FreeBSD) seem to have lined up behind little-endian PowerPC. IBM still has a big-endian problem with AIX, IBM i, and Linux on Z.
  • Announcing Chapel 1.32
    6 projects | news.ycombinator.com | 9 Oct 2023
    Roughly, the sets of computational problems that people used (use?) MPI for. Things like numerical solvers for sparse matrices that are so big that you need to split them across your entire cluster. These still require a lot of node-to-node communication, and on top of it, the pattern is dependent on each problem (so easy solutions like map-reduce are effectively out). See eg https://www.open-mpi.org/, and https://courses.csail.mit.edu/18.337/2005/book/Lecture_08-Do... for the prototypical use case.
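    To make that communication pattern concrete, here is a hedged toy sketch of the kind of global reduction that sits inside iterative sparse solvers, written with mpi4py (Python bindings over an MPI implementation such as Open MPI); the vector sizes are placeholders.

        # Run with e.g.: mpirun -n 4 python dot.py
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD

        # Each rank owns only its local slice of the distributed vectors.
        local_n = 1_000_000
        x = np.random.rand(local_n)
        y = np.random.rand(local_n)

        # A solver such as CG needs this collective on every iteration,
        # which is the node-to-node communication the comment refers to.
        local_dot = float(np.dot(x, y))
        global_dot = comm.allreduce(local_dot, op=MPI.SUM)

        if comm.Get_rank() == 0:
            print("global dot product:", global_dot)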
  • How much are you meant to comment on a code?
    1 project | /r/AskProgramming | 11 May 2023
    One of the guys at the local LUG is one of the lead maintainers of Open MPI. He told us about a comment that ran into the hundreds of lines, all for a one-line change in the code.
  • Which license to choose when you want credit
    1 project | /r/github | 12 Mar 2023
    But it would be very inconvenient to have to keep crediting everyone who's ever worked on it. If you look at old projects, their licenses can have like 10-20 of those lines (here's one I was recently looking into).
  • First True Exascale Supercomputer
    2 projects | news.ycombinator.com | 6 Jul 2022
    I have a bit of experience programming for a highly-parallel supercomputer, specifically in my case an IBM BlueGene/Q. In that case, the answer is a lot of message passing (we used Open MPI [0]). Since the nodes are discrete and don't have any shared memory, you end up with something kinda reminiscent of the actor model as popularized by Erlang and co -- but in C for number-crunching performance.

    That said, each of the nodes is itself composed of multiple cores with shared memory. So in cases where you really want to grind out performance, you actually end up using message passing to divvy up chunks of work, and then use classic pthreads to parallelize things further, with lower latency.

    Debugging is a bit of a nightmare, though, since some bugs inevitably only come up once you have a large number of nodes running the algorithm in parallel. But you'll probably be in a mainframe-style time-sharing setup, so you may have to wait hours or more to rerun things.

    This applies less to some of the newer supercomputers, which are more or less clusters of GPUs instead of clusters of CPUs. I imagine there's some commonality, but I haven't worked with any of them so I can't really say.

    [0] https://www.open-mpi.org/
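
    As a rough sketch of the hybrid pattern the comment describes (message passing between nodes, shared-memory threads within a node), here is a hedged mpi4py version; the original setting was C with MPI and pthreads, and the work function, sizes, and tags below are illustrative placeholders.

        from concurrent.futures import ThreadPoolExecutor

        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        def crunch(block):
            # NumPy releases the GIL here, so in-node threads can overlap.
            return float(np.linalg.norm(block))

        if rank == 0:
            # Rank 0 divvies up chunks of work with explicit messages.
            for dest in range(1, comm.Get_size()):
                comm.send(np.random.rand(4, 100_000), dest=dest, tag=0)
            results = [comm.recv(source=src, tag=1) for src in range(1, comm.Get_size())]
            print("partial results:", results)
        else:
            blocks = comm.recv(source=0, tag=0)
            with ThreadPoolExecutor() as pool:       # shared-memory parallelism per node
                partial = sum(pool.map(crunch, blocks))
            comm.send(partial, dest=0, tag=1)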

  • Managing parallelism by process vs by machine
    1 project | /r/ExperiencedDevs | 30 May 2022
  • MPI + CUDA Program for thermal conductivity problem
    2 projects | /r/CUDA | 4 May 2022
    I would suggest using OpenMPI because it's pretty easy to get started with. You can build OpenMPI with CUDA support, then you can pass device pointers directly to MPI_Send and MPI_Recv. Then you don't have to deal with transfers and synchronization issues.
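    A hedged sketch of that idea using mpi4py on top of a CUDA-aware Open MPI build: the buffer-style Send/Recv calls can take CuPy (device-memory) arrays directly, so no host/device staging is written by hand. The array size is a placeholder, and both CuPy and a CUDA-enabled MPI build are assumptions here.

        # Run with e.g.: mpirun -n 2 python gpu_sendrecv.py
        import cupy as cp
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        if rank == 0:
            buf = cp.arange(1_000_000, dtype=cp.float32)   # lives in GPU memory
            comm.Send(buf, dest=1, tag=0)                   # device buffer handed to MPI
        else:
            buf = cp.empty(1_000_000, dtype=cp.float32)
            comm.Recv(buf, source=0, tag=0)
            print("rank 1 received, sum =", float(buf.sum()))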
  • Distributed Training Made Easy with PyTorch-Ignite
    7 projects | dev.to | 10 Aug 2021
    backends from native torch distributed configuration: nccl, gloo, mpi.
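    For context, here is a minimal sketch of selecting one of those backends with plain torch.distributed, which PyTorch-Ignite builds on; the backend choice and launcher details (e.g. torchrun providing the rendezvous environment variables) are assumptions for illustration.

        import torch
        import torch.distributed as dist

        # "nccl" for GPU collectives, "gloo" for CPU, "mpi" if torch was built with MPI.
        dist.init_process_group(backend="gloo")

        rank = dist.get_rank()
        t = torch.ones(4) * (rank + 1)
        dist.all_reduce(t, op=dist.ReduceOp.SUM)   # sum across all processes
        print(f"rank {rank}: {t}")

        dist.destroy_process_group()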
  • FEA computer simulation question
    1 project | /r/buildapc | 23 Apr 2021
    I use a Linux Ubuntu machine with MPI (https://www.open-mpi.org/). I had a question on making my computer simulations faster. Would it be better to get an older AMD 9590 machine clocked at 4.7 GHz or continue using my Ryzen 7 1700 machine clocked at something like 3.5 GHz?
  • C Deep
    80 projects | dev.to | 27 Feb 2021
    OpenMPI - Message passing interface implementation. BSD-3-Clause

What are some alternatives?

When comparing cunumeric and ompi you can also consider the following projects:

cupy - NumPy & SciPy for GPU

gloo - Collective communications library with various primitives for multi-machine training.

CudaPy - CudaPy is a runtime library that lets Python programmers access NVIDIA's CUDA parallel computation API.

Redis - Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.

CUDA.jl - CUDA programming in Julia.

NCCL - Optimized primitives for collective multi-GPU communication

numba - NumPy aware dynamic Python compiler using LLVM

FlatBuffers - FlatBuffers: Memory Efficient Serialization Library

legate.pandas - An Aspiring Drop-In Replacement for Pandas at Scale

libvips - A fast image processing library with low memory needs.

grcuda - Polyglot CUDA integration for the GraalVM

SWIFT - Modern astrophysics and cosmology particle-based code. Mirror of gitlab developments at https://gitlab.cosma.dur.ac.uk/swift/swiftsim