Top 23 C++ Benchmark Projects
-
As with the assembly code analysis, let's look at two types of operations: sequentially filling an array with random numbers, and randomly accessing arbitrary elements of an array that has been passed to a function. To compile the code, we used GCC and Clang. We checked both debug builds (with the -DDEBUG flag) and release builds (with -O3 optimization and the -DNDEBUG flag). And of course, we used Google Benchmark to run the tests.
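The two operations described above can be sketched as follows. This is a minimal illustration, not the article's code: it times with `std::chrono` instead of Google Benchmark so that it stays self-contained, and the names `kSize`, `fill_random`, `random_access_sum`, and `time_ns` are hypothetical helpers introduced only for this example.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical element count, chosen only for illustration.
constexpr std::size_t kSize = 1 << 16;

// Sequentially fill the first n elements with pseudo-random numbers.
// Works for both std::array<int, N> and a built-in int[N] passed by reference.
template <typename Container>
void fill_random(Container& data, std::size_t n, std::mt19937& rng) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = static_cast<int>(rng() & 0xFFFF);
    }
}

// Read `reads` elements at pseudo-random positions and return their sum,
// so the compiler cannot discard the loads as dead code.
template <typename Container>
std::int64_t random_access_sum(const Container& data, std::size_t n,
                               std::size_t reads, std::mt19937& rng) {
    assert(n > 0);  // guards the modulo below
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < reads; ++i) {
        sum += data[rng() % n];
    }
    return sum;
}

// Run a callable once and return the elapsed wall-clock time in nanoseconds.
// A single run is noisy; a real harness repeats the measurement many times.
template <typename Fn>
std::int64_t time_ns(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

To compare the two array kinds, one could declare a `static std::array<int, kSize>` and a `static int[kSize]`, seed two `std::mt19937` generators identically, and wrap each `fill_random` or `random_access_sum` call in `time_ns`. Compiling once without optimization and once with `-O3 -DNDEBUG` roughly mirrors the debug and release setups mentioned in the excerpt.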
-
Project mention: How much traffic can a pre-rendered Next.js site handle? | news.ycombinator.com | 2025-03-08
I have also found that Next.js is shockingly slow.
I recently added some benchmarks to the TechEmpower Web Framework Benchmarks suite, and Next.js ranked near dead last, even for simple JSON API endpoint (i.e. no React SSR involved): https://www.techempower.com/benchmarks/#section=data-r23&hw=...
I discussed it with a couple of Next.js maintainers (https://github.com/vercel/next.js/discussions/75930), and they indicated that it's only a problem for "standalone" deployments (i.e. not on Vercel). However, I'm not entirely convinced that is true. I wonder if there are major optimizations that could be made to, for example, the routing system.
-
FluidX3D
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
-
microservices-framework-benchmark
Raw benchmarks on throughput, latency and transfer of Hello World on popular microservices frameworks
-
alpaca
Serialization library written in C++17 - Pack C++ structs into a compact byte-array without any macros or boilerplate code (by p-ranav)
-
less_slow.cpp
Learning how to write "Less Slow" code in C++ 20, C 99, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO
Project mention: DeepGEMM: Clean and efficient FP8 GEMM kernels with fine-grained scaling | news.ycombinator.com | 2025-02-25
I generally avoid FP8 and prefer I8, but your question got me wondering how well cuBLAS performs.
First of all, cuBLAS needs the cuBLASLt extension API for mixed-precision workloads to handle FP8. Second, some adequate type combinations, like E5M2 x E5M2 for A x B, are not supported, while others, like E5M2 x E4M3, are! Moreover, matrix A must always come in a transposed layout for Ampere, Hopper, and Blackwell... and the list of constraints goes on.
I've integrated FP8 cuBLASLt benchmarks into my "Less Slow C++" repository <https://github.com/ashvardanian/less_slow.cpp>, adding to the list of existing cuBLAS and hand-rolled CUDA and PTX benchmarks. I'm running them on H200 GPUs, which should have the same performance as H100. For square inputs, the throughput peaks around 1.35 Peta-ops.
That's around 67% of the advertised number for dense GEMM <https://resources.nvidia.com/en-us-data-center-overview-mc/e...>.
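The throughput figures quoted above follow from simple arithmetic, sketched here under stated assumptions. The operation count uses the standard convention of 2·M·N·K operations for a dense M×K by K×N matrix product (one multiply and one add per inner-product term); the function names are hypothetical, and the peak of 2 Peta-ops used in the comment is only what 1.35 / 0.67 implies, not a number taken from the vendor page.

```cpp
// Operations in a dense GEMM of an M x K matrix by a K x N matrix:
// one multiply and one add for each of the M * N * K inner-product terms.
constexpr double gemm_ops(double m, double n, double k) {
    return 2.0 * m * n * k;
}

// Achieved throughput in operations per second.
constexpr double throughput(double ops, double seconds) {
    return ops / seconds;
}

// Fraction of an advertised peak that a measurement reaches.
// Example: utilization(1.35e15, 2.0e15) is 0.675, close to the
// "around 67%" quoted above if the peak is roughly 2 Peta-ops (assumed).
constexpr double utilization(double measured, double advertised) {
    return measured / advertised;
}

// Compile-time sanity check on a tiny 2x4 by 4x3 product.
static_assert(gemm_ops(2, 3, 4) == 48.0, "2 * 2 * 3 * 4 operations expected");
```

For a square input with M = N = K, `gemm_ops` reduces to 2·N³, so dividing that by the measured kernel time gives the kind of figure reported in the comment.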
-
mixbench
A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)
-
elbencho
A distributed storage benchmark for file systems, object stores & block devices with support for GPUs
C++ Benchmark related posts
-
How bloom filters made SQLite 10x faster
-
Good research of Java (JIT) vs. C++ (AOT) performance with interesting results
-
Jaws – a JavaScript to WASM ahead of time compiler
-
std::array in C++ isn't slower than array in C
-
Yes, Ruby is fast, but…
-
How can I check the execution time of a program rendered in SFML?
-
How to Perf profile functions?
-
A note from our sponsor - CodeRabbit
coderabbit.ai | 25 Mar 2025
Index
What are some of the best open-source Benchmark projects in C++? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | benchmark | 9,330 |
| 2 | FrameworkBenchmarks | 7,759 |
| 3 | FluidX3D | 4,306 |
| 4 | coost | 4,071 |
| 5 | cista | 1,965 |
| 6 | nanobench | 1,510 |
| 7 | ut | 1,312 |
| 8 | Celero | 841 |
| 9 | microservices-framework-benchmark | 707 |
| 10 | alpaca | 497 |
| 11 | less_slow.cpp | 495 |
| 12 | mixbench | 389 |
| 13 | BabelStream | 330 |
| 14 | map_benchmark | 301 |
| 15 | ecs_benchmark | 246 |
| 16 | uVkCompute | 234 |
| 17 | ubench.h | 230 |
| 18 | libCacheSim | 207 |
| 19 | OpenCL-Benchmark | 193 |
| 20 | elbencho | 189 |
| 21 | benchmarking-fft | 145 |
| 22 | c2clat | 143 |
| 23 | math-parser-benchmark-project | 140 |