NCCL VS C++ Actor Framework

Compare NCCL and C++ Actor Framework to see how they differ.

NCCL

Optimized primitives for collective multi-GPU communication (by NVIDIA)
               NCCL                                       C++ Actor Framework
Mentions       3                                          4
Stars          2,796                                      3,092
Growth         3.5%                                       0.9%
Activity       5.9                                        9.8
Last commit    2 days ago                                 12 days ago
Language       C++                                        C++
License        GNU General Public License v3.0 or later   BSD 3-clause "New" or "Revised" License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

NCCL

Posts with mentions or reviews of NCCL. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-06.
  • MPI jobs to test
    2 projects | /r/HPC | 6 Jun 2023
    % rm -rf /tmp/nccl ; git clone --recursive https://github.com/NVIDIA/nccl.git ; cd nccl ; git grep MPI
    Cloning into 'nccl'...
    remote: Enumerating objects: 2769, done.
    remote: Counting objects: 100% (336/336), done.
    remote: Compressing objects: 100% (140/140), done.
    remote: Total 2769 (delta 201), reused 287 (delta 196), pack-reused 2433
    Receiving objects: 100% (2769/2769), 3.04 MiB | 3.37 MiB/s, done.
    Resolving deltas: 100% (1820/1820), done.
    README.md:NCCL (pronounced "Nickel") is a stand-alone library of standard communication routines for GPUs, implementing all-reduce, all-gather, reduce, broadcast, reduce-scatter, as well as any send/receive based communication pattern. It has been optimized to achieve high bandwidth on platforms using PCIe, NVLink, NVswitch, as well as networking using InfiniBand Verbs or TCP/IP sockets. NCCL supports an arbitrary number of GPUs installed in a single node or across multiple nodes, and can be used in either single- or multi-process (e.g., MPI) applications.
    src/collectives/broadcast.cc:/* Deprecated original "in place" function, similar to MPI */
  • NVLink and Dual 3090s
    1 project | /r/nvidia | 4 May 2022
    If it's rendering, you don't really need SLI; you need to install NCCL so that GPU memory can be pooled: https://github.com/NVIDIA/nccl
  • Distributed Training Made Easy with PyTorch-Ignite
    7 projects | dev.to | 10 Aug 2021
    backends from native torch distributed configuration: nccl, gloo, mpi.
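
The README excerpt quoted above lists all-reduce among the routines NCCL implements. As a rough illustration of what that collective computes, the sketch below simulates a sum all-reduce across in-process "ranks" in plain C++. It does not call NCCL and needs no GPU. The helper name `all_reduce_sum` and the vector-of-vectors layout are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of NCCL: simulates a sum all-reduce.
// Each inner vector stands in for one rank's (one GPU's) buffer.
// After the call, every rank holds the element-wise sum of all inputs.
void all_reduce_sum(std::vector<std::vector<float>>& ranks) {
    if (ranks.empty()) return;
    const std::size_t n = ranks.front().size();
    std::vector<float> total(n, 0.0f);
    // Reduce phase: accumulate every rank's contribution.
    for (const auto& buf : ranks) {
        assert(buf.size() == n && "all ranks must contribute equal-sized buffers");
        for (std::size_t i = 0; i < n; ++i) total[i] += buf[i];
    }
    // Broadcast phase: every rank receives the same reduced result.
    for (auto& buf : ranks) buf = total;
}
```

Calling `all_reduce_sum` on three ranks holding {1, 2}, {3, 4}, and {5, 6} leaves each rank with {9, 12}. In NCCL itself the corresponding call is `ncclAllReduce`, which operates on device buffers; this naive reduce-then-broadcast version only mirrors the result, not how the library moves data.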

C++ Actor Framework

Posts with mentions or reviews of C++ Actor Framework. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-04.
  • C++ Jobs - Q3 2023
    3 projects | /r/cpp | 4 Jul 2023
    CAF
  • Actor system for the JVM developed by Electronic Arts
    6 projects | news.ycombinator.com | 28 Apr 2022
    I'd like to mention the native actor model implementation CAF, the C++ Actor Framework, and share some experiences. (Disclaimer: I've worked on CAF in the past and have a good relationship with the creator.) CAF provides (1) native actors without a VM layer, (2) type-safe interfaces, so that the compiler yells at you when a receiver cannot handle a message, and (3) transparent copy-on-write messaging, so that you can still push stuff through pipelines and incur copies only when a ref count is greater than one.

    In our telemetry engine VAST, we've been using CAF successfully for several years for building a distributed system that always has a saturated write path. CAF provides a credit-based streaming abstraction as well, so that you can have backpressure across a chain of actors, making burst-induced OOM issues a blast from the past. You also get all the other benefits of actors, like linking and monitoring, to achieve well-defined failure semantics: either be up and running or collectively fail, but still allowing for local recovery—except for segfaults, this is where "native" has a disadvantage over VM-based actor models.

    With CAF's network-transparent runtime, a message sender doesn't need to know where the receiver lives; the runtime either passes the message as a COW pointer to the receiver or serializes it transparently. Other actor model runtimes support that as well, but I'm mentioning it because our experience showed that this is of great value: we can slice and dice our actors based on the deployment target, e.g., execute the application in a single process (e.g., for a beefy box) or wrap actors into individual OS processes (e.g., when deploying on container auto-scalers).

    The deep integration with the C++ type system allowed us to define very stable RPC-like interfaces. We're currently designing a pub/sub layer as an alternate access path, because users are interested in tapping into streaming feeds selectively. This is not easy, because request-response and pub/sub are two ends of a spectrum, but it turns out we can support both nicely with CAF.

    Resources:

    - CAF: https://github.com/actor-framework/actor-framework

    - VAST: https://tenzir.github.io/vast/docs/understand-vast/actor-mod... (sorry for the incompleteness, we're in migration mode from the old docs, but this page is summarizing the benefits of CAF for us best)

    - Good general actor model background: http://dist-prog-book.com/chapter/3/message-passing.html#why...

  • C++ Jobs - Q2 2022
    4 projects | /r/cpp | 3 Apr 2022
    VAST is a flight recorder and security content execution engine. On the one hand, there exists a continuous stream of high-volume data sources (such as network telemetry like NetFlow, Zeek, and Suricata, as well as endpoint telemetry). On the other hand, VAST processes needle-in-haystack queries to provide answers to questions like "was this threat relevant to us 8 months ago?", and supports threat hunters with an interactive query capability to explore the data. From an engineering perspective, we focus especially on the separation of read and write path, concurrent message passing in an actor model runtime (CAF), and leveraging open standards, like Apache Arrow, to establish a high-bandwidth data plane for sharing data with downstream tooling. A flexible plugin API enables additional security-specific use cases on top, such as realtime matching of threat intelligence or mining of asset data for passive inventorization.
  • C++ Jobs - Q4 2021
    4 projects | /r/cpp | 2 Oct 2021
    Technologies: Apache Arrow, Flatbuffers, C++ Actor Framework, Linux, Docker, Kubernetes
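
The Hacker News post above describes copy-on-write messaging, where a copy is made only when the reference count is greater than one. The sketch below shows that general idea with `std::shared_ptr`. It is not CAF's message type or API: the class `cow_message` and its members are hypothetical names, and the `use_count()` check is only reliable in this single-threaded setting.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration, not CAF's actual message type.
// Handles share one payload; a writer copies it only if it is shared.
class cow_message {
public:
    explicit cow_message(std::string payload)
        : data_(std::make_shared<std::string>(std::move(payload))) {}

    // Read access never copies.
    const std::string& get() const { return *data_; }

    // Write access detaches first when another handle shares the payload.
    std::string& get_mutable() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
        return *data_;
    }

    // Exposed only so the example can show when a copy happened.
    const void* address() const { return data_.get(); }

private:
    std::shared_ptr<std::string> data_;
};
```

Copying a `cow_message` shares the payload, so passing it along a pipeline of readers costs only a reference-count increment. The first call to `get_mutable()` on a shared handle gives that handle a private copy, while a handle that is the sole owner is modified in place.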

What are some alternatives?

When comparing NCCL and C++ Actor Framework you can also consider the following projects:

gloo - Collective communications library with various primitives for multi-machine training.

Boost.Asio - Asio C++ Library

Thrust - [ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

libuv - Cross-platform asynchronous I/O

HPX - The C++ Standard Library for Parallelism and Concurrency

libevent - Event notification library

xla - Enabling PyTorch on XLA Devices (e.g. Google TPU)

rotor - Event loop friendly C++ actor micro-framework, supervisable

Easy Creation of GnuPlot Scripts from C++ - A simple C++17 lib that helps you to quickly plot your data with GnuPlot

Taskflow - A General-purpose Parallel and Heterogeneous Task Programming System

libev - Full-featured high-performance event loop loosely modelled after libevent