Top 23 C++ Performance Projects
-
With fwrite that would be another level of buffering in addition to FILE's buffer. If you are interested in what {fmt} is doing, a good starting point is https://github.com/fmtlib/fmt/blob/35dcc58263d6b55419a5932bd.... It is also possible to bypass stdio completely and get even faster output (https://vitaut.net/posts/2020/optimal-file-buffer-size/) and while it is great for files, it may introduce interleaving problems with things like stdout.
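To make "bypassing stdio" concrete, here is a minimal sketch that keeps a single user-space buffer and flushes it straight to a file descriptor with POSIX `write()`. It is not how {fmt} is implemented; the `FdWriter` class, its 4096-byte buffer size, and its method names are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <string_view>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of {fmt}): one buffer, flushed directly with
// write(), so there is no FILE buffer layered underneath it.
class FdWriter {
public:
    explicit FdWriter(int fd) : fd_(fd) {}
    ~FdWriter() { flush(); }
    FdWriter(const FdWriter&) = delete;
    FdWriter& operator=(const FdWriter&) = delete;

    // Copy bytes into the buffer, flushing whenever it fills up.
    void append(std::string_view s) {
        while (!s.empty()) {
            if (size_ == sizeof(buf_)) flush();
            std::size_t n = std::min(s.size(), sizeof(buf_) - size_);
            std::memcpy(buf_ + size_, s.data(), n);
            size_ += n;
            s.remove_prefix(n);
        }
    }

    // Write everything buffered so far; retries on short writes and EINTR.
    // On any other error the buffered bytes are dropped and false is returned.
    bool flush() {
        std::size_t off = 0;
        while (off < size_) {
            ssize_t n = ::write(fd_, buf_ + off, size_ - off);
            if (n < 0) {
                if (errno == EINTR) continue;
                size_ = 0;
                return false;
            }
            off += static_cast<std::size_t>(n);
        }
        size_ = 0;
        return true;
    }

private:
    int fd_;
    std::size_t size_ = 0;
    char buf_[4096];  // assumed size; tune it for your workload
};
```

Constructing `FdWriter w(fd);` on a descriptor returned by `open()` and calling `w.append("...")` is enough to use it. Because nothing here coordinates with stdio, output sent to descriptor 1 this way can interleave with `printf` output that is still sitting in `stdout`'s own buffer, which is the caveat the comment raises.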
-
have you looked at tracy : https://github.com/wolfpld/tracy ?
It seems to be on par with, if not better than, the other offerings.
-
Project mention: Website Is Served from Nine Neovim Buffers on My Old ThinkPad | news.ycombinator.com | 2025-08-18
-
Yep. We’re in a situation where C-like languages couple layout and access interface very tightly. But, now cache is such an overriding issue in optimization, you really want to rapidly experiment with different layouts without rewriting your whole algorithm every time. AOS, SOA, AOSOA, hot/cold data for different stages, etc…
Jon Blow’s Jai language famously added a feature to references that allowed you to easily experiment with moving data members between hot/cold arrays of structs.
https://halide-lang.org/ tackles a related problem. It decouples the math to be done from the access order so as to allow you to rapidly test looping over data in complicated ways to achieve cache-friendly access patterns for your specific hardware target without rewriting your whole core loop every time.
Halide is primarily about image processing convolution kernels. I’m not sure how general purpose it can get.
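For readers unfamiliar with the AOS/SOA distinction in this comment, here is a small sketch of the same data in both layouts. The particle types and function names are hypothetical and exist only for illustration. Note that the loop has to be written twice, which is exactly the coupling between layout and access code that the comment complains about.

```cpp
#include <cstddef>
#include <vector>

// Array of structs (AOS): all fields of one particle sit next to each other.
struct ParticleAoS {
    float x, y, z;
    float mass;
};

// Struct of arrays (SOA): each field gets its own contiguous array.
struct ParticlesSoA {
    std::vector<float> x, y, z, mass;

    void push_back(float px, float py, float pz, float m) {
        x.push_back(px);
        y.push_back(py);
        z.push_back(pz);
        mass.push_back(m);
    }
};

// Reads only `mass`, but each cache line it pulls in also carries x, y and z.
float total_mass_aos(const std::vector<ParticleAoS>& ps) {
    float sum = 0.0f;
    for (const ParticleAoS& p : ps) sum += p.mass;
    return sum;
}

// Same computation, but it walks a dense array that holds only masses.
float total_mass_soa(const ParticlesSoA& ps) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < ps.mass.size(); ++i) sum += ps.mass[i];
    return sum;
}
```

Which layout is faster depends on the access pattern and the hardware, so treat this as a way to experiment, not as a recommendation.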
-
Project mention: CI/CD Observability with OpenTelemetry Step by Step Guide | news.ycombinator.com | 2025-06-15
A child comment mentioned k8s but I also have been chomping at the bit to try out the eBPF hooks in https://github.com/pixie-io/pixie (or even https://github.com/coroot/coroot or https://github.com/parca-dev/parca ) all of which are Apache 2 licensed
The demo for https://github.com/draios/sysdig was also just amazing, but I don't have any idea what the storage requirements would be for leaving it running
-
> I suspect that I have an outdated version of hotspot Linux profiler, but I can’t just go and download a fresh release from GitHub, because hotspot is a KDE app, and I use NixOS.
KDE (not to be confused with the Plasma desktop) is just a bunch of C++ libraries that can work on a variety of desktop environments and even OSes (though Hotspot being a perf report alternative is clearly meant for use with Linux).
I just went and downloaded the latest CI build from [0] and it ran just fine on my openSUSE Tumbleweed, running Xorg with Window Maker. I do have a bunch of KDE apps installed, like Kate (my currently preferred text editor), Dolphin (the file manager i use whenever i want thumbnails, usually for videos and images), Spectacle (for screenshots), Falkon (i use it as a "clean" browser to test out things), etc so i also do have the KDE libraries on my system, but that is just a `zypper install` away. Or an `apt-get install` or `pacman -S` or whatever package manager your distro uses, i've used a bunch of them and they all pretty much behaved the same. I'd expect Hotspot to be installable in the same way in any of them.
If there are issues with NixOS (i don't know, i haven't tried it) i think it might actually be a NixOS issue and not a KDE issue.
[0] https://github.com/KDAB/hotspot/releases/tag/continuous
-
Project mention: Strobelight: A profiling service built on open source technology | news.ycombinator.com | 2025-03-07
At Yandex, we have a similar profiler that seamlessly supports native languages in addition to Python/Java: https://github.com/yandex/perforator. Pretty exciting to see new profilers from big players!
-
Project mention: Ask HN: Would you pay for 100x faster TypeScript type checker | news.ycombinator.com | 2025-01-31
I'm currently evaluating whether it's worth rebooting TypeRunner [1] and whether it would bring enough value to people. I'm not interested in open-sourcing it as it's a lot of work for a single person and I did/do already too much OSS.
This is in contrast to existing solutions like SWC, which do not do any type checking, but just transpiling.
[1] https://github.com/marcj/TypeRunner
-
Project mention: Tracy: A real time, nanosecond resolution frame profiler | news.ycombinator.com | 2024-09-23
Does anybody have an opinion or comparison with respect to easy_profiler?
https://github.com/yse/easy_profiler
Especially interesting if based on real practical experience.
-
less_slow.cpp
Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO
Thanks, appreciate the gesture :)
Traditional SWAR on GPUs is a fascinating topic. I've begun assembling a set of synthetic benchmarks to compare DP4A vs. DPX (<https://github.com/ashvardanian/less_slow.cpp/pull/35>), but it feels incomplete without SWAR. My working hypothesis is that 64-bit SWAR on properly aligned data could be very useful in GPGPU, though FMA/MIN/MAX operations in that PR might not be the clearest showcase of its strengths. Do you have a better example or use case in mind?
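As background for the SWAR idea mentioned here, the following is a minimal CPU-side sketch of 64-bit SWAR: eight independent 8-bit additions performed with ordinary 64-bit integer operations. It is not taken from the linked PR and it is not GPU code; the function name `swar_add_u8` is hypothetical.

```cpp
#include <cstdint>

// Adds eight packed unsigned bytes lane by lane, each lane wrapping modulo 256.
// Hypothetical illustration of SWAR, not code from less_slow.cpp.
std::uint64_t swar_add_u8(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t high = 0x8080808080808080ULL;  // top bit of every byte
    // Add the low 7 bits of each lane; the sum fits in 8 bits, so no carry
    // can leak into the neighbouring lane.
    const std::uint64_t low_sum = (a & ~high) + (b & ~high);
    // Restore the top bit of each lane: a7 XOR b7 XOR carry-in from the low 7 bits.
    return low_sum ^ ((a ^ b) & high);
}
```

For example, `swar_add_u8(0x01FF, 0x0101)` yields `0x0200`: the lowest lane wraps from `0xFF` to `0x00` without carrying into the next lane, which independently computes `0x01 + 0x01`.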
-
ada
WHATWG-compliant and fast URL parser written in modern C++, part of Internet Archive, Node.js, Clickhouse, Redpanda, Kong, Telegram, Adguard, Datadog and Cloudflare Workers.
-
CppServer
Ultra fast and low latency asynchronous socket server & client C++ library with support TCP, SSL, UDP, HTTP, HTTPS, WebSocket protocols and 10K connections problem solution
-
ultimatepp
U++ is a C++ cross-platform rapid application development framework focused on programmer's productivity. It includes a set of libraries (GUI, SQL, Network etc.), and integrated development environment (TheIDE).
C++ Performance discussion
C++ Performance related posts
-
Website Is Served from Nine Neovim Buffers on My Old ThinkPad
-
Arm Desktop: x86 Emulation
-
.NET: The Ideal Platform for Microservices in 2025
-
Constrained languages are easier to optimize
-
Redesigned Swift.org is now live
-
Tracy profiler new release 0.12.0
-
Stop the Hack: Why Quick-and-Dirty Development Is Hurting Us All
-
Index
What are some of the best open-source Performance projects in C++? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | {fmt} | 22,279 |
| 2 | tracy | 12,655 |
| 3 | FrameworkBenchmarks | 7,942 |
| 4 | Halide | 6,165 |
| 5 | pixie | 6,145 |
| 6 | hotspot | 4,824 |
| 7 | ArrayFire | 4,768 |
| 8 | oneDNN | 3,868 |
| 9 | perforator | 3,238 |
| 10 | TypeRunner | 2,627 |
| 11 | easy_profiler | 2,294 |
| 12 | palanteer | 2,172 |
| 13 | icinga2 | 2,117 |
| 14 | datatable | 1,871 |
| 15 | less_slow.cpp | 1,841 |
| 16 | Boost.Compute | 1,624 |
| 17 | ada | 1,581 |
| 18 | CppServer | 1,538 |
| 19 | CacheLib | 1,409 |
| 20 | nebula | 1,018 |
| 21 | speedb | 980 |
| 22 | ultimatepp | 920 |
| 23 | oneMath | 705 |