elbencho
ParallelReductionsBenchmark
| | elbencho | ParallelReductionsBenchmark |
|---|---|---|
| Mentions | 2 | 2 |
| Stars | 146 | 59 |
| Growth | - | - |
| Activity | 7.5 | 4.6 |
| Last commit | 25 days ago | 5 months ago |
| Language | C++ | C++ |
| License | GNU General Public License v3.0 only | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
elbencho
- [HELP] Nvidia GPUDirect storage benchmark for an AI400 system
You can also use elbencho (https://github.com/breuner/elbencho), which is functionally equivalent to IOR but a little more flexible.
- WD Black SN850 1 TB SSD Review - The Fastest SSD
I won't speak for him, but Tallis is definitely aware (see his recent article on updated testing), and so are others. I regularly work with Sean Webster of Tom's Hardware (/u/TurboSSD), and we spend an insane amount of time working around SLC cache response and discussing it on my Discord. These guys often use different tools (e.g. Iometer vs. FIO), although the one I've been playing with moving forward is elbencho. Either way, it's something that takes up a lot of time in SSD reviewer circles, since it's a relatively tight-knit group. It's a challenging topic, especially as SLC caching algorithms are getting more complex, with behavioral and performance-based profiles, and reviewers are already doing preconditioning.
ParallelReductionsBenchmark
- Failing to Reach 204 GB/S DDR4 Bandwidth
For the single threaded version, they have a data hazard on the sums that could be smoothed out with a little loop unrolling and separate variables.
But in the [threaded version](https://github.com/unum-cloud/ParallelReductions/blob/fd16d9...) they have separate slots for an accumulator but it's still in a shared vector, which most likely has the issue I described.
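To make the two points in the comment above concrete, here is a minimal C++17 sketch. It is not code from ParallelReductionsBenchmark; the names `sum_unrolled`, `sum_threaded`, and `PaddedSlot` are hypothetical. The first function shows "loop unrolling and separate variables" for the single-threaded sum. The second assumes that "the issue I described" refers to false sharing between adjacent per-thread accumulators (the excerpt does not name it), and assumes a 64-byte cache line, which is typical but not universal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Single-threaded sum with four independent accumulators. Each accumulator
// forms its own dependency chain, so consecutive additions do not have to
// wait on one another the way they do with a single running sum.
inline double sum_unrolled(const float* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    for (; i < n; ++i) s0 += data[i];  // leftover tail elements
    return (s0 + s1) + (s2 + s3);
}

// Assumption: 64-byte cache lines. Aligning each slot to 64 bytes keeps two
// threads' results from landing on the same line.
struct alignas(64) PaddedSlot {
    double value = 0.0;
};

// Threaded sum: each thread reduces its chunk into a local variable and
// writes its padded slot exactly once at the end.
inline double sum_threaded(const float* data, std::size_t n, unsigned thread_count) {
    if (thread_count == 0) thread_count = 1;
    std::vector<PaddedSlot> slots(thread_count);
    std::vector<std::thread> workers;
    const std::size_t chunk = (n + thread_count - 1) / thread_count;
    for (unsigned t = 0; t < thread_count; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back([=, &slots] {
            slots[t].value = sum_unrolled(data + begin, end - begin);
        });
    }
    for (auto& w : workers) w.join();
    double total = 0.0;
    for (const auto& s : slots) total += s.value;
    return total;
}
```

Calling `sum_threaded(v.data(), v.size(), 4)` on a `std::vector<float> v` splits the work across four threads. Accumulating into a local and padding the slots are two separate defenses; either one alone would probably remove most of the contention the comment speculates about. Note that splitting a floating-point sum across accumulators changes the order of additions, so results can differ from a plain loop by rounding.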
What are some alternatives?
CrystalDiskInfo - CrystalDiskInfo
MatX - An efficient C++17 GPU numerical computing library with Python-like syntax
oneflow - OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
ispc - Intel® Implicit SPMD Program Compiler
beatmup - Beatmup: image and signal processing library
gpuowl - GPU Mersenne primality test.
cubefs - cloud-native file store
alpaka - Abstraction Library for Parallel Kernel Acceleration :llama:
GLSL-PathTracer - A toy physically based GPU path tracer (C++/OpenGL/GLSL)
cuda_memtest - Fork of CUDA GPU memtest :eyeglasses:
sedutil - Use sedutil for setting up and using self encrypting drives (SEDs) that comply with the TCG OPAL 2.00 standard. This includes the requisite pre-boot authentication image.
amgcl - C++ library for solving large sparse linear systems with algebraic multigrid method