Thrust, CUB, TBB, AVX2, CUDA, OpenCL, OpenMP, SYCL - all it takes to sum a lot of numbers fast!
Why do you think that https://github.com/preda/gpuowl is a good alternative to ParallelReductionsBenchmark?