ParallelReductionsBenchmark Alternatives

Similar projects and alternatives to ParallelReductionsBenchmark

MatX

7 1,115 9.1 C++ ParallelReductionsBenchmark VS MatX

An efficient C++17 GPU numerical computing library with Python-like syntax
ispc

4 2,396 9.5 C++ ParallelReductionsBenchmark VS ispc

Intel® Implicit SPMD Program Compiler
gpuowl

1 109 7.7 C++ ParallelReductionsBenchmark VS gpuowl

GPU Mersenne primality test.
alpaka

1 321 9.3 C++ ParallelReductionsBenchmark VS alpaka

Abstraction Library for Parallel Kernel Acceleration :llama: (by alpaka-group)
cuda_memtest

2 107 3.6 C++ ParallelReductionsBenchmark VS cuda_memtest

Fork of CUDA GPU memtest :eyeglasses:
amgcl

1 702 3.9 C++ ParallelReductionsBenchmark VS amgcl

C++ library for solving large sparse linear systems with algebraic multigrid method
eaminer

1 4 1.8 C++ ParallelReductionsBenchmark VS eaminer

Heterogeneous Ethereum Miner with support for AMD, Intel and Nvidia GPUs using SYCL, OpenCL and CUDA backends
relion

1 423 6.2 C++ ParallelReductionsBenchmark VS relion

Image-processing software for cryo-electron microscopy
laser

6 260 3.6 Nim ParallelReductionsBenchmark VS laser

The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers (by mratsim)
numactl

1 379 8.1 C ParallelReductionsBenchmark VS numactl

NUMA support for Linux
vuda

6 832 3.2 C++ ParallelReductionsBenchmark VS vuda

VUDA is a header-only library based on Vulkan that provides a CUDA Runtime API interface for writing GPU-accelerated applications.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better ParallelReductionsBenchmark alternative or higher similarity.

ParallelReductionsBenchmark reviews and mentions

Posts with mentions or reviews of ParallelReductionsBenchmark. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-02-02.

Failing to Reach 204 GB/S DDR4 Bandwidth
3 projects | news.ycombinator.com | 2 Feb 2022

For the single-threaded version, they have a data hazard on the sums that could be smoothed out with a little loop unrolling and separate variables.
But in the [threaded version](https://github.com/unum-cloud/ParallelReductions/blob/fd16d9...), they have a separate slot for each accumulator, but the slots still sit in a shared vector, which most likely has the issue I described.
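
The two suggestions in that comment can be sketched in C++ as follows. This is an illustration only, not the benchmark's code: the names `sum_unrolled`, `sum_threaded`, and `PaddedSum` are hypothetical, the 64-byte cache-line size is an assumption, and the "issue" in the threaded version is assumed here to be false sharing between neighbouring slots of the shared vector (the excerpt does not say so explicitly). C++17 is assumed so that `std::vector` honours the over-aligned element type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Single-threaded sum with four independent accumulators, so that
// consecutive additions do not all depend on the same running total.
inline float sum_unrolled(const float* data, std::size_t n) {
    float s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += data[i + 0];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    float tail = 0;
    for (; i < n; ++i) tail += data[i];  // leftover elements
    return (s0 + s1) + (s2 + s3) + tail;
}

// Assumption: 64-byte cache lines. Each slot gets a line of its own,
// so threads writing to neighbouring slots do not contend for one line.
struct alignas(64) PaddedSum {
    float value = 0;
};

inline float sum_threaded(const float* data, std::size_t n,
                          std::size_t thread_count) {
    assert(thread_count > 0);
    std::vector<PaddedSum> partial(thread_count);
    std::vector<std::thread> workers;
    workers.reserve(thread_count);
    const std::size_t chunk = (n + thread_count - 1) / thread_count;
    for (std::size_t t = 0; t < thread_count; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back([&partial, data, begin, end, t] {
            // Accumulate in locals and write the shared slot only once.
            partial[t].value = sum_unrolled(data + begin, end - begin);
        });
    }
    for (auto& w : workers) w.join();
    float total = 0;
    for (const auto& p : partial) total += p.value;
    return total;
}
```

Calling `sum_threaded(values.data(), values.size(), 4)` splits the input into four contiguous chunks. Accumulating in local variables already keeps the hot loop off the shared vector; the padding additionally covers the final write to each slot. Whether either change helps on a given machine would need to be measured.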

Stats

Basic ParallelReductionsBenchmark repo stats

Mentions

Stars

Activity: 4.6

Last Commit: 5 months ago

The primary programming language of ParallelReductionsBenchmark is C++.