C++ parallel-computing

Open-source C++ projects categorized as parallel-computing

Top 23 C++ parallel-computing Projects

  1. Taskflow

    A General-purpose Task-parallel Programming System using Modern C++

    Project mention: Show HN: Coros – A Modern C++ Library for Task Parallelism | news.ycombinator.com | 2024-09-25

    Martin, have you had a look at https://github.com/taskflow/taskflow ?

  3. CTranslate2

    Fast inference engine for Transformer models

    Project mention: Brood War Korean Translations | news.ycombinator.com | 2025-01-17

    Thanks for the added context on the builds! As a "foreign" BW player and fellow speech processing researcher, I agree shallow contextual biasing should help. While not difficult to implement, most generally available ASR solutions don't make it easy to use. There's a PR in ctranslate2 implementing the same feature so that it could be exposed in faster-whisper: https://github.com/OpenNMT/CTranslate2/pull/1789

  4. kokkos

    Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction

  5. mfem

    Lightweight, general, scalable C++ library for finite element methods

  6. cccl

    CUDA Core Compute Libraries

    Project mention: Learning Assembly for Fun, Performance and Profit | news.ycombinator.com | 2025-04-12

    So I would say skill at GPU assembly is in-demand for the elite tier of GPU performance work. Not necessarily writing much of it (though see [1] for an example, this is the kernel of multisplit as used in Nvidia's Onesweep implementation), but definitely in being able to read it so you can understand what the compiled code is actually doing. I'll also cite as evidence of that the incredible work of the engineers on Nanite. They describe writing the core of the microtriangle software renderer in HLSL but analyzing the assembler output to optimize down to the cycle level, as described in their "deep dive into Nanite virtualized geometry" talk (timestamp points to the reference to instruction-level micro-optimization).

    [1]: https://github.com/NVIDIA/cccl/blob/2d1fa6bc9235106740d9373c...

    [2]: https://www.youtube.com/watch?v=eviSykqSUUw&t=2073s

  7. Vc

    SIMD Vector Classes for C++

    Project mention: Understanding SIMD: Infinite Complexity of Trivial Problems | news.ycombinator.com | 2024-11-30

    I'm surprised no one has mentioned Vc. I found ispc clunky and not as performant, and std::simd didn't support some useful math ops like rsqrt. Vc has been around for years, I have no trouble including it in my codes, it has masking and many of the most useful math ops, and I can get over 1 TF/s on a consumer-grade Ryzen and at least 3 TF/s on the big Epyc CPUs.

    https://github.com/VcDevel/Vc

  8. Kratos

    Kratos Multiphysics (A.K.A Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos has a BSD license and is written in C++ with an extensive Python interface. (by KratosMultiphysics)

  10. dolfinx

    Next generation FEniCS problem solving environment

  11. libfork

    A bleeding-edge, lock-free, wait-free, continuation-stealing tasking library built on C++20's coroutines

  12. oneMath

    oneAPI Math Library (oneMath)

  13. RAJA

    RAJA Performance Portability Layer (C++)

  14. parlaylib

    A Toolkit for Programming Parallel Algorithms on Shared-Memory Multicore Machines

  15. coros

    An easy-to-use and fast library for task-based parallelism, utilizing coroutines. (by mtmucha)

    Project mention: Show HN: Coros – A Modern C++ Library for Task Parallelism | news.ycombinator.com | 2024-09-25

    In your dequeue/circular buffer implementation, how is it able to grow the queue without locking?

    The code seems to rely on atomics for head & tail, but grows the queue without any special provisions I can see.

    https://github.com/mtmucha/coros/blob/ee30d3c1d0602c3071aa26...

  16. feelpp

    Feel++: Finite Element Embedded Language and Library in C++

  17. PothosCore

    The Pothos data-flow framework

  18. areg-sdk

    AREG is a cross-platform asynchronous Object RPC framework to simplify multitasking programming by blurring borders between processes and treating remote objects as if they coexist in the same thread.

  19. CPURasterizer

    CPU Based Rasterizer Engine

  20. ConcurrentDeque

    Fast, generalized implementation of the Chase-Lev lock-free work-stealing deque for C++17

  21. cppRouting

    Algorithms for Routing and Solving the Traffic Assignment Problem

  22. Lazy

    Light-weight header-only library for parallel function calls and continuations in C++ based on Eric Niebler's talk at CppCon 2019.

  23. Bulk

    A modern interface for implementing bulk-synchronous parallel programs.

  24. parallel-dfs-dag

    A parallel implementation of DFS for Directed Acyclic Graphs (https://research.nvidia.com/publication/parallel-depth-first-search-directed-acyclic-graphs)

  25. libGPGPU

    Multi-GPU & CPU OpenCL kernel executor with load-balancing as if there is one big GPU.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
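
The question quoted under coros (how a deque guarded only by atomic head and tail indices can grow without a lock) is the problem the Chase-Lev design behind ConcurrentDeque addresses. The following is a minimal sketch of that idea, not code taken from either project: `GrowableDeque` and `Ring` are hypothetical names, elements are plain `int`, and every atomic operation on the indices uses the sequentially consistent default for simplicity. Only the owner thread pushes, so only the owner ever resizes; it copies the live range into a larger ring, publishes the new pointer atomically, and keeps the old ring alive so that a thief still holding the old pointer reads valid memory.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <optional>
#include <vector>

// Illustrative sketch only: GrowableDeque and Ring are hypothetical names,
// not types from coros or ConcurrentDeque.
class GrowableDeque {
    struct Ring {
        std::int64_t capacity;  // always a power of two
        std::unique_ptr<std::atomic<int>[]> slots;

        explicit Ring(std::int64_t c)
            : capacity(c),
              slots(new std::atomic<int>[static_cast<std::size_t>(c)]) {}

        // Logical index i maps onto the ring by masking with capacity - 1.
        std::atomic<int>& at(std::int64_t i) const {
            return slots[static_cast<std::size_t>(i & (capacity - 1))];
        }
    };

    std::atomic<std::int64_t> top_{0};     // thieves advance this with CAS
    std::atomic<std::int64_t> bottom_{0};  // only the owner writes this
    std::atomic<Ring*> ring_;
    std::vector<std::unique_ptr<Ring>> retired_;  // owner-only; keeps old rings alive

public:
    explicit GrowableDeque(std::int64_t cap = 4) : ring_(new Ring(cap)) {
        assert(cap > 0 && (cap & (cap - 1)) == 0);  // power of two required
    }
    ~GrowableDeque() { delete ring_.load(); }

    std::int64_t capacity() const { return ring_.load()->capacity; }

    // Owner thread only.
    void push(int v) {
        std::int64_t b = bottom_.load();
        std::int64_t t = top_.load();
        Ring* r = ring_.load();
        if (b - t >= r->capacity) {
            // Full: copy the live range [t, b) into a ring twice the size.
            Ring* bigger = new Ring(r->capacity * 2);
            for (std::int64_t i = t; i < b; ++i) {
                bigger->at(i).store(r->at(i).load(std::memory_order_relaxed),
                                    std::memory_order_relaxed);
            }
            retired_.emplace_back(r);  // not freed: a thief may still read it
            ring_.store(bigger);       // publish the new ring
            r = bigger;
        }
        r->at(b).store(v, std::memory_order_relaxed);
        bottom_.store(b + 1);  // makes the new element visible to thieves
    }

    // Owner thread only; takes the most recently pushed element.
    std::optional<int> pop() {
        std::int64_t b = bottom_.load() - 1;
        Ring* r = ring_.load();
        bottom_.store(b);  // reserve the bottom slot before checking top
        std::int64_t t = top_.load();
        if (t > b) {  // deque was empty
            bottom_.store(b + 1);
            return std::nullopt;
        }
        int v = r->at(b).load(std::memory_order_relaxed);
        if (t == b) {  // last element: race against thieves for it
            bool won = top_.compare_exchange_strong(t, t + 1);
            bottom_.store(b + 1);
            if (!won) return std::nullopt;
        }
        return v;
    }

    // Any thread; takes the oldest element.
    std::optional<int> steal() {
        std::int64_t t = top_.load();
        std::int64_t b = bottom_.load();
        if (t >= b) return std::nullopt;
        Ring* r = ring_.load();
        int v = r->at(t).load(std::memory_order_relaxed);
        // The CAS fails if another thief or the owner claimed slot t first.
        if (!top_.compare_exchange_strong(t, t + 1)) return std::nullopt;
        return v;
    }
};
```

In this sketch, retired rings are only released when the deque is destroyed, which trades memory for simplicity; a production implementation would likely reclaim them earlier and relax the memory orderings. Constructing `GrowableDeque d(2)` and pushing five values grows the ring twice, after which `pop()` returns the newest value and `steal()` the oldest.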

C++ parallel-computing related posts

  • Show HN: Coros – A Modern C++ Library for Task Parallelism

    8 projects | news.ycombinator.com | 25 Sep 2024
  • rodin alternatives - mfem and FreeFem-sources

    7 projects | 8 Mar 2023
  • Learn PDE constrained optimization

    2 projects | /r/math | 31 Jan 2023
  • Open source FEA tools instead of ANSYS Workbench and APDL

    2 projects | /r/fea | 25 Jan 2023
  • Eighty Years of the Finite Element Method: Birth, Evolution, and Future

    2 projects | news.ycombinator.com | 5 Nov 2022
  • Fortran on GPU

    4 projects | /r/fortran | 21 Oct 2022
  • Best Python package(s) to solve PDEs numerically?

    1 project | /r/computationalphysics | 14 Oct 2022

Index

What are some of the best open-source parallel-computing projects in C++? This list will help you:

 #  Project           Stars
 1  Taskflow          10,848
 2  CTranslate2        3,799
 3  kokkos             2,192
 4  mfem               1,889
 5  cccl               1,637
 6  Vc                 1,485
 7  Kratos             1,107
 8  dolfinx              895
 9  libfork              700
10  oneMath              675
11  RAJA                 519
12  parlaylib            363
13  coros                323
14  feelpp               320
15  PothosCore           312
16  areg-sdk             295
17  CPURasterizer        179
18  ConcurrentDeque      144
19  cppRouting           114
20  Lazy                 112
21  Bulk                  94
22  parallel-dfs-dag      50
23  libGPGPU              11
