Top 23 C++ Performance Projects
-
This reminds me that I was recently using tracy to profile a program and found their own list of valid grievances, although it's a bit more GNOME-inflicted: https://github.com/wolfpld/tracy/issues/505#issuecomment-136...
-
Project mention: Archived: Popular backend frameworks by performance benchmark ranking in 2024 | dev.to | 2025-07-05
Since 2013, TechEmpower has maintained a backend framework benchmark. They meticulously define benchmark specifications and keep the project open source, which encourages contributions from the community. This benchmark has become a respected standard in the tech industry, serving as a reliable yardstick for technology competitors to assess the performance of their solutions (for example Go Fiber, C# ASP.NET, JS Just). So I can trust the TechEmpower benchmark.
-
Yep. We’re in a situation where C-like languages couple layout and access interface very tightly. But, now cache is such an overriding issue in optimization, you really want to rapidly experiment with different layouts without rewriting your whole algorithm every time. AOS, SOA, AOSOA, hot/cold data for different stages, etc…
Jon Blow’s Jai language famously added a feature to references that allowed you to easily experiment with moving data members between hot/cold arrays of structs.
https://halide-lang.org/ tackles a related problem. It decouples the math to be done from the access order so as to allow you to rapidly test looping over data in complicated ways to achieve cache-friendly access patterns for your specific hardware target without rewriting your whole core loop every time.
Halide is primarily about image processing convolution kernels. I’m not sure how general purpose it can get.
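To make the layout coupling concrete, here is a minimal C++ sketch of the same reduction written against an array-of-structs (AoS) layout and a struct-of-arrays (SoA) layout. The type and function names (`ParticleAoS`, `ParticlesSoA`, `sum_x_aos`, `sum_x_soa`, `to_soa`) are hypothetical and chosen only for illustration; they do not come from the comment above or from any of the listed projects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structs (AoS): all fields of one particle sit next to each other,
// so a loop that reads only `x` still drags `y`, `z`, and `mass` into cache.
struct ParticleAoS {
    float x, y, z, mass;
};

// Struct of arrays (SoA): each field is stored contiguously on its own,
// so a loop over `x` touches only the bytes it actually needs.
struct ParticlesSoA {
    std::vector<float> x, y, z, mass;
};

// Sum of the x coordinates, AoS flavor: access goes through `p.x`.
float sum_x_aos(const std::vector<ParticleAoS>& ps) {
    float total = 0.0f;
    for (const ParticleAoS& p : ps) total += p.x;
    return total;
}

// Same algorithm, SoA flavor: access goes through `ps.x[i]`.
float sum_x_soa(const ParticlesSoA& ps) {
    // SoA invariant (assumed here): every column has the same length.
    assert(ps.x.size() == ps.mass.size());
    float total = 0.0f;
    for (float v : ps.x) total += v;
    return total;
}

// Convert from AoS to SoA by copying each field into its own column.
ParticlesSoA to_soa(const std::vector<ParticleAoS>& ps) {
    ParticlesSoA out;
    out.x.reserve(ps.size());
    out.y.reserve(ps.size());
    out.z.reserve(ps.size());
    out.mass.reserve(ps.size());
    for (const ParticleAoS& p : ps) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
        out.z.push_back(p.z);
        out.mass.push_back(p.mass);
    }
    return out;
}
```

The reduction had to be written twice because the access syntax (`p.x` versus `ps.x[i]`) follows the layout. That duplication is the cost the comment is pointing at when it talks about experimenting with layouts.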
-
Project mention: CI/CD Observability with OpenTelemetry Step by Step Guide | news.ycombinator.com | 2025-06-15
A child comment mentioned k8s but I also have been chomping at the bit to try out the eBPF hooks in https://github.com/pixie-io/pixie (or even https://github.com/coroot/coroot or https://github.com/parca-dev/parca ) all of which are Apache 2 licensed
The demo for https://github.com/draios/sysdig was also just amazing, but I don't have any idea what the storage requirements would be for leaving it running
-
> I suspect that I have an outdated version of hotspot Linux profiler, but I can’t just go and download a fresh release from GitHub, because hotspot is a KDE app, and I use NixOS.
KDE (not to be confused with the Plasma desktop) is just a bunch of C++ libraries that can work on a variety of desktop environments and even OSes (though Hotspot being a perf report alternative is clearly meant for use with Linux).
I just went and downloaded the latest CI build from [0] and it ran just fine on my openSUSE Tumbleweed, running Xorg with Window Maker. I do have a bunch of KDE apps installed, like Kate (my currently preferred text editor), Dolphin (the file manager I use whenever I want thumbnails, usually for videos and images), Spectacle (for screenshots), Falkon (I use it as a "clean" browser to test out things), etc., so I also do have the KDE libraries on my system, but that is just a `zypper install` away. Or an `apt-get install` or `pacman -S` or whatever package manager your distro uses; I've used a bunch of them and they all pretty much behaved the same. I'd expect Hotspot to be installable in the same way in any of them.
If there are issues with NixOS (I don't know, I haven't tried it), I think it might actually be a NixOS issue and not a KDE issue.
[0] https://github.com/KDAB/hotspot/releases/tag/continuous
-
Project mention: Strobelight: A profiling service built on open source technology | news.ycombinator.com | 2025-03-07
At Yandex, we have a similar profiler that supports native languages seamlessly, in addition to Python/Java: https://github.com/yandex/perforator. Pretty exciting to see new profilers from big players!
-
Project mention: Ask HN: Would you pay for 100x faster TypeScript type checker | news.ycombinator.com | 2025-01-31
I'm currently evaluating whether it's worth rebooting TypeRunner [1] and whether it would bring enough value to people. I'm not interested in open-sourcing it, as it's a lot of work for a single person and I already did/do too much OSS.
This is in contrast to existing solutions like SWC, which do not do any type checking, only transpiling.
[1] https://github.com/marcj/TypeRunner
-
Project mention: Tracy: A real time, nanosecond resolution frame profiler | news.ycombinator.com | 2024-09-23
Does anybody have an opinion or comparison with respect to easy_profiler?
https://github.com/yse/easy_profiler
Especially interesting if based on real practical experience.
-
less_slow.cpp
Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking and user-space IO
Thanks, appreciate the gesture :)
Traditional SWAR on GPUs is a fascinating topic. I've begun assembling a set of synthetic benchmarks to compare DP4A vs. DPX (<https://github.com/ashvardanian/less_slow.cpp/pull/35>), but it feels incomplete without SWAR. My working hypothesis is that 64-bit SWAR on properly aligned data could be very useful in GPGPU, though FMA/MIN/MAX operations in that PR might not be the clearest showcase of its strengths. Do you have a better example or use case in mind?
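For readers unfamiliar with SWAR (SIMD within a register), here is a minimal host-side C++ sketch of the idea on a 64-bit word: eight 8-bit lanes are added at once without letting carries leak between lanes. This is an illustration only. It is not taken from the linked PR, it does not cover the FMA/MIN/MAX cases mentioned above, and the names `kLaneHighBits`, `add_u8x8`, and `add_u8x8_reference` are hypothetical.

```cpp
#include <cstdint>

// High bit of each of the eight 8-bit lanes packed into one 64-bit word.
constexpr std::uint64_t kLaneHighBits = 0x8080808080808080ULL;

// Lane-wise wrapping add of eight uint8 values held in a single uint64.
constexpr std::uint64_t add_u8x8(std::uint64_t a, std::uint64_t b) {
    // Add only the low 7 bits of every lane. Each lane's partial sum is at
    // most 254, so it fits in 8 bits and cannot carry into the next lane.
    std::uint64_t low = (a & ~kLaneHighBits) + (b & ~kLaneHighBits);
    // Rebuild each lane's top bit: carry-in XOR a's top bit XOR b's top bit.
    return low ^ ((a ^ b) & kLaneHighBits);
}

// Scalar reference: the same operation, one lane at a time.
constexpr std::uint64_t add_u8x8_reference(std::uint64_t a, std::uint64_t b) {
    std::uint64_t result = 0;
    for (int lane = 0; lane < 8; ++lane) {
        std::uint64_t x = (a >> (8 * lane)) & 0xFF;
        std::uint64_t y = (b >> (8 * lane)) & 0xFF;
        result |= ((x + y) & 0xFF) << (8 * lane);
    }
    return result;
}

// 0xFF + 0x01 wraps to 0x00 inside lane 0 and leaves lane 1 untouched.
static_assert(add_u8x8(0xFFULL, 0x01ULL) == 0, "carry must not cross lanes");
```

A plain 64-bit `a + b` would give `0x100` for the last case, spilling a carry into the neighboring lane. Masking off the high bits before the add and patching them back with XOR is what keeps the lanes independent.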
-
ada
WHATWG-compliant and fast URL parser written in modern C++, part of Node.js, Clickhouse, Redpanda, Kong, Telegram, Datadog and Cloudflare Workers.
-
CppServer
Ultra fast and low latency asynchronous socket server & client C++ library with support for the TCP, SSL, UDP, HTTP, HTTPS, and WebSocket protocols and a solution to the 10K connections problem
-
ultimatepp
U++ is a C++ cross-platform rapid application development framework focused on programmer productivity. It includes a set of libraries (GUI, SQL, Network etc.) and an integrated development environment (TheIDE).
Have you ever tried U++? I haven't used it beyond quick and dirty testing, but it has a decent GUI builder and is a full C++ IDE.
https://www.ultimatepp.org/
C++ Performance discussion
C++ Performance related posts
- Redesigned Swift.org is now live
- Tracy profiler new release 0.12.0
- Stop the Hack: Why Quick-and-Dirty Development Is Hurting Us All
- Hotspot: Linux `perf` GUI for performance analysis
- Faster sorting with SIMD CUDA intrinsics
- Ask HN: What is the most interesting thing you've learned lately?
- Show HN: Less Slow C++: Revisiting Performance Tricks for C/C++/CUDA/Asm/PTX
A note from our sponsor - InfluxDB
www.influxdata.com | 14 Jul 2025
Index
What are some of the best open-source Performance projects in C++? This list will help you:
# | Project | Stars |
---|---|---|
1 | {fmt} | 22,025 |
2 | tracy | 12,233 |
3 | FrameworkBenchmarks | 7,896 |
4 | Halide | 6,121 |
5 | pixie | 6,081 |
6 | hotspot | 4,761 |
7 | ArrayFire | 4,736 |
8 | oneDNN | 3,829 |
9 | perforator | 3,225 |
10 | TypeRunner | 2,625 |
11 | easy_profiler | 2,288 |
12 | palanteer | 2,154 |
13 | icinga2 | 2,105 |
14 | datatable | 1,867 |
15 | less_slow.cpp | 1,810 |
16 | Boost.Compute | 1,615 |
17 | ada | 1,558 |
18 | CppServer | 1,538 |
19 | CacheLib | 1,369 |
20 | nebula | 1,002 |
21 | speedb | 980 |
22 | ultimatepp | 902 |
23 | oneMath | 697 |