ParallelReductionsBenchmark vs ispc

ParallelReductionsBenchmark	Project	ispc
2	Mentions	4
59	Stars	2,405
-	Growth	1.2%
4.6	Activity	9.5
5 months ago	Latest Commit	5 days ago
C++	Language	C++
-	License	BSD 3-clause "New" or "Revised" License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

ParallelReductionsBenchmark

Posts with mentions or reviews of ParallelReductionsBenchmark. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-02-02.

Failing to Reach 204 GB/S DDR4 Bandwidth
3 projects | news.ycombinator.com | 2 Feb 2022

For the single-threaded version, they have a data hazard on the sums that could be smoothed out with a little loop unrolling and separate accumulator variables.
But in the [threaded version](https://github.com/unum-cloud/ParallelReductions/blob/fd16d9...) each thread has its own accumulator slot, yet the slots still sit in a shared vector, which most likely has the issue I described.
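
A minimal C++17 sketch of the two remedies the comment hints at, not the benchmark's actual code. The names `sum_unrolled`, `PaddedSum` and `sum_threaded` are hypothetical, the 64-byte cache line is an assumption about the target CPU, and reading "the issue I described" as false sharing between adjacent accumulator slots is also an assumption, since the earlier comment is not quoted here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Single-threaded sum with four independent accumulators, so that
// consecutive additions do not all depend on the same running total.
inline float sum_unrolled(const float* data, std::size_t n) {
    float s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    for (; i < n; ++i) s0 += data[i];  // leftover elements
    return (s0 + s1) + (s2 + s3);
}

// One slot per thread, aligned to an ASSUMED 64-byte cache line so that
// two threads never write to the same line.
struct alignas(64) PaddedSum {
    float value = 0;
};

inline float sum_threaded(const std::vector<float>& data, std::size_t thread_count) {
    assert(thread_count > 0);
    std::vector<PaddedSum> partial(thread_count);
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + thread_count - 1) / thread_count;
    for (std::size_t t = 0; t < thread_count; ++t) {
        workers.emplace_back([&, t] {
            const std::size_t begin = std::min(t * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            // Accumulate in locals, then write the padded slot once.
            partial[t].value = sum_unrolled(data.data() + begin, end - begin);
        });
    }
    for (auto& w : workers) w.join();
    float total = 0;
    for (const auto& p : partial) total += p.value;
    return total;
}
```

Accumulating in local variables and writing each slot once already removes most of the cross-thread traffic; the padding only guarantees that the final writes land on separate cache lines. Note that regrouping the additions can change the rounding of a floating-point sum compared with a strictly sequential loop.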

ispc

Posts with mentions or reviews of ispc. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-10-14.

Implementing a GPU's Programming Model on a CPU
2 projects | news.ycombinator.com | 14 Oct 2023

This so-called GPU programming model existed for decades before the first GPUs appeared, but at that time the compilers were not as good as the CUDA compilers, so the burden on the programmer was greater.
As another poster has already mentioned, there exists a compiler for CPUs which has been inspired by CUDA and which has been available for many years: ISPC (Implicit SPMD Program Compiler), at https://github.com/ispc/ispc .
NVIDIA has the very annoying habit of using a lot of terms that are different from those that have been used in computer science for decades. The worst is that NVIDIA has not invented new words, but has frequently reused words that were already widely used with other meanings.
SIMT (Single Instruction, Multiple Threads) is not the worst term coined by NVIDIA, but there was no need for yet another acronym. For instance, they could have used SPMD (Single Program, Multiple Data), which dates from 1988, two decades before CUDA.
Moreover, SIMT is the same thing that was called "array of processes" by C.A.R. Hoare in August 1978 (in "Communicating Sequential Processes"), or "replicated parallel" by Occam in 1985 or "PARALLEL DO" by "OpenMP Fortran" in 1997-10 or "parallel for" by "OpenMP C and C++" in 1998-10.
The only (but extremely important) innovation brought by CUDA is that the compiler is smart enough that the programmer does not need to know the structure of the processor, i.e. how many cores it has and how many SIMD lanes each core has. The CUDA compiler automatically distributes the work over the available SIMD lanes and cores, and in most cases the programmer does not care whether two executions of the per-item function run on two different cores or on two different SIMD lanes of the same core.
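
A rough plain C++ illustration of that last point, under stated assumptions: this is neither CUDA nor ISPC code, and `for_each_index` and `saxpy` are hypothetical names made up for the sketch. The caller writes only the function for one data item; the helper decides how the items are split across cores, and the contiguous inner loop is left to the compiler's auto-vectorizer, which may or may not map it onto SIMD lanes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: calls body(i) once for every i in [0, count).
// The caller never states how indices map to cores or SIMD lanes.
template <typename Body>
void for_each_index(std::size_t count, Body body) {
    // hardware_concurrency() may return 0 when the value is unknown.
    const std::size_t cores =
        std::max<std::size_t>(1, std::thread::hardware_concurrency());
    const std::size_t chunk = (count + cores - 1) / cores;
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < count; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, count);
        workers.emplace_back([=] {
            // Contiguous, independent iterations: a candidate for
            // auto-vectorization across SIMD lanes.
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& w : workers) w.join();
}

// The "program" is written for a single item: y[i] = a * x[i] + y[i].
inline void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const float* xs = x.data();
    float* ys = y.data();
    for_each_index(y.size(), [=](std::size_t i) { ys[i] = a * xs[i] + ys[i]; });
}
```

The sketch only works because the per-item bodies are independent of each other; a real SPMD compiler additionally handles divergent control flow and lane masking, which this helper does not attempt.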

SIMD intrinsics and the possibility of a standard library solution
16 projects | /r/cpp | 8 Jan 2023

ISPC: https://github.com/ispc/ispc

Prefix Sum with SIMD
2 projects | news.ycombinator.com | 12 Feb 2022

Have you looked at [ISPC - Intel SPMD Program Compiler](https://github.com/ispc/ispc)?

Duff’s Device in 2021
3 projects | news.ycombinator.com | 18 Nov 2021

What are some alternatives?

When comparing ParallelReductionsBenchmark and ispc you can also consider the following projects:

MatX - An efficient C++17 GPU numerical computing library with Python-like syntax

highway - Performance-portable, length-agnostic SIMD with runtime dispatch

gpuowl - GPU Mersenne primality test.

Beef - Beef Programming Language

alpaka - Abstraction Library for Parallel Kernel Acceleration

micro-profiler - Cross-platform low-footprint realtime C/C++ Profiler

cuda_memtest - Fork of CUDA GPU memtest

elena-lang - ELENA is a general-purpose language with late binding. It is multi-paradigm, combining features of functional and object-oriented programming. A rich set of tools is provided to deal with message dispatching: multi-methods, message qualifying, generic message handlers, run-time interfaces

eaminer - Heterogeneous Ethereum Miner with support for AMD, Intel and Nvidia GPUs using SYCL, OpenCL and CUDA backends

lunix - Lua Unix Module.

relion - Image-processing software for cryo-electron microscopy

eve - Expressive Vector Engine - SIMD in C++ Goes Brrrr
