gpu_clock_stabilizer vs ParallelReductionsBenchmark
| | gpu_clock_stabilizer | ParallelReductionsBenchmark |
|---|---|---|
| Mentions | 1 | 2 |
| Stars | 6 | 59 |
| Growth | - | - |
| Activity | 0.0 | 4.6 |
| Last commit | over 1 year ago | 5 months ago |
| Language | C++ | C++ |
| License | MIT License | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gpu_clock_stabilizer
GPU clock stabilizer for consistent GPU profiling on Windows
I've just released a simple GPU clock stabilizer for consistent GPU profiling on Windows. It makes timestamp query results, which modern graphics APIs use to calculate elapsed GPU time, more deterministic, at the expense of lower performance. If you have any questions, you can find me on Twitter.
ParallelReductionsBenchmark
Failing to Reach 204 GB/S DDR4 Bandwidth
For the single-threaded version, they have a data hazard on the sum: each addition depends on the result of the previous one. That could be smoothed out with a little loop unrolling and separate accumulator variables.
But in the [threaded version](https://github.com/unum-cloud/ParallelReductions/blob/fd16d9...), each thread has its own accumulator slot, yet the slots still live in a shared vector, which most likely has the issue I described.
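The loop-unrolling suggestion for the single-threaded case can be sketched roughly as follows. This is a minimal illustration, not code from the benchmark: the function names `sum_serial` and `sum_unrolled` and the unroll factor of four are assumptions chosen for the example.

```cpp
#include <cstddef>
#include <vector>

// Baseline (hypothetical name): every addition depends on the previous
// value of `sum`, so the loop forms one long dependency chain.
float sum_serial(const std::vector<float>& data) {
    float sum = 0.0f;
    for (float x : data) sum += x;
    return sum;
}

// Unrolled by four (an assumed factor) with separate accumulators.
// The four chains are independent of each other, so the CPU can keep
// several additions in flight at once.
float sum_unrolled(const std::vector<float>& data) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    const std::size_t n = data.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    // Handle the remaining 0 to 3 elements.
    float tail = 0.0f;
    for (; i < n; ++i) tail += data[i];
    // Combine the partial sums once, at the end.
    return (s0 + s1) + (s2 + s3) + tail;
}
```

Because floating-point addition is not associative, the unrolled version can differ from the serial one in the last few bits on general inputs. Whether a compiler does this transformation on its own depends on flags that permit reordering floating-point operations.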
What are some alternatives?
libcudacxx - [ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl
MatX - An efficient C++17 GPU numerical computing library with Python-like syntax
takedetour - A template (and a sample) for writing tracers on Windows. Based on the Detours library.
ispc - Intel® Implicit SPMD Program Compiler
GLSL-PathTracer - A toy physically based GPU path tracer (C++/OpenGL/GLSL)
gpuowl - GPU Mersenne primality test.
Thrust - [ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
alpaka - Abstraction Library for Parallel Kernel Acceleration :llama:
cuda-api-wrappers - Thin C++-flavored header-only wrappers for core CUDA APIs: Runtime, Driver, NVRTC, NVTX.
cuda_memtest - Fork of CUDA GPU memtest :eyeglasses:
eaminer - Heterogeneous Ethereum Miner with support for AMD, Intel and Nvidia GPUs using SYCL, OpenCL and CUDA backends
relion - Image-processing software for cryo-electron microscopy