tracy
parallel-hashmap
Our great sponsors
tracy | parallel-hashmap | |
---|---|---|
57 | 31 | |
7,762 | 2,307 | |
- | - | |
9.6 | 7.6 | |
2 days ago | 14 days ago | |
C++ | C++ | |
GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tracy
- Tracy: Real-time nanosecond resolution frame profiler
-
Google/orbit – C/C++ Performance Profiler
i don't really think there is _anything_ that comes even close to tracy https://github.com/wolfpld/tracy.
on top of this, given google's penchant for dumping projects aka abandonware, this would be an easy pass.
-
Immediate Mode GUI Programming
The RemedyBG debugger (https://remedybg.handmade.network/) and the Tracy profiler (https://github.com/wolfpld/tracy) both use Dear ImGui and so far I've only read high praise from people who used those tools compared to the 'established' alternatives.
For tools like this, programmers are also just "normal users", and from the developer side, I'm sure they evaluated various alternatives with all their pros and cons before settling for Dear ImGui.
- Tracy Profiler
-
Tuning Linux for Performance
Not the person you asked, but generally you might want to look at "frame-based" profilers. These are typically used in video games, but the concept is general, and can apply to other applications. The "frame" could also be something like a request or transaction being processed. I like Tracy[1], myself.
Another latency metric that you'll see, often w/respect to web apps and microservices is "P99" and similar. This is the amount of time in which 99% of requests get their response. For a higher percentile, you get a better idea of worst-case performance.
-
What is your favourite profiling tool for C++?
I've not actually used Superluminal, but I use Tracy for similar reasons. It's free though (and, importantly, open source).
-
My first game engine
For profiling, you can check tracy.
-
I got my procedural city engine / game (built from scratch in c++) running on the steam deck - does it look too garish?
You could try Tracy
-
Sharing Saturday #462
There is no such thing as overengineering in fun projects, so I've also adopted Tracy as profiling solution. Works quite nice and gonna save me plenty of times in the future debugging performance spikes on badly optimized math heavy operations.
-
Debugging and profiling embedded applications.
I know about tools such as tracing, jaeger or tracy. While having a complete tracing could be a potential solution, these tools don't work with no_std.
parallel-hashmap
-
The One Billion Row Challenge in CUDA: from 17 minutes to 17 seconds
Standard library maps/unordered_maps are themselves notoriously slow anyway. A sparse_hash_map from abseil or parallel-hashmaps[1] would be better.
-
My own Concurrent Hash Map picks
Cool! Looking forward to you trying my phmap - and please let me know if you have any question.
-
Boost 1.81 will have boost::unordered_flat_map...
I do this as well in my phmap and gtl implementations. It makes the tables look worse in benchmarks like the above, but prevents really bad surprises occasionally.
-
Comprehensive C++ Hashmap Benchmarks 2022
Thanks a lot for the great benchmark, Martin. Glad you used different hash functions, because I do sacrifice some speed to make sure that the performance of my hash maps doesn't degrade drastically with poor hash functions. Happy to see that my phmap and gtl (the C++20 version) performed well.
-
Can C++ maps be as efficient as Python dictionaries ?
I use https://github.com/greg7mdp/parallel-hashmap when I need better performance of maps and sets.
-
How to build a Chess Engine, an interactive guide
Then they should really try https://github.com/greg7mdp/parallel-hashmap, the current state of the art.
-
boost::unordered map is a new king of data structures
Unordered hash map shootout CMAP = https://github.com/tylov/STC KMAP = https://github.com/attractivechaos/klib PMAP = https://github.com/greg7mdp/parallel-hashmap FMAP = https://github.com/skarupke/flat_hash_map RMAP = https://github.com/martinus/robin-hood-hashing HMAP = https://github.com/Tessil/hopscotch-map TMAP = https://github.com/Tessil/robin-map UMAP = std::unordered_map Usage: shootout [n-million=40 key-bits=25] Random keys are in range [0, 2^25). Seed = 1656617916: T1: Insert/update random keys: KMAP: time: 1.949, size: 15064129, buckets: 33554432, sum: 165525449561381 CMAP: time: 1.649, size: 15064129, buckets: 22145833, sum: 165525449561381 PMAP: time: 2.434, size: 15064129, buckets: 33554431, sum: 165525449561381 FMAP: time: 2.112, size: 15064129, buckets: 33554432, sum: 165525449561381 RMAP: time: 1.708, size: 15064129, buckets: 33554431, sum: 165525449561381 HMAP: time: 2.054, size: 15064129, buckets: 33554432, sum: 165525449561381 TMAP: time: 1.645, size: 15064129, buckets: 33554432, sum: 165525449561381 UMAP: time: 6.313, size: 15064129, buckets: 31160981, sum: 165525449561381 T2: Insert sequential keys, then remove them in same order: KMAP: time: 1.173, size: 0, buckets: 33554432, erased 20000000 CMAP: time: 1.651, size: 0, buckets: 33218751, erased 20000000 PMAP: time: 3.840, size: 0, buckets: 33554431, erased 20000000 FMAP: time: 1.722, size: 0, buckets: 33554432, erased 20000000 RMAP: time: 2.359, size: 0, buckets: 33554431, erased 20000000 HMAP: time: 0.849, size: 0, buckets: 33554432, erased 20000000 TMAP: time: 0.660, size: 0, buckets: 33554432, erased 20000000 UMAP: time: 2.138, size: 0, buckets: 31160981, erased 20000000 T3: Remove random keys: KMAP: time: 1.973, size: 0, buckets: 33554432, erased 23367671 CMAP: time: 2.020, size: 0, buckets: 33218751, erased 23367671 PMAP: time: 2.940, size: 0, buckets: 33554431, erased 23367671 FMAP: time: 1.147, size: 0, buckets: 33554432, erased 23367671 RMAP: time: 1.941, size: 0, buckets: 33554431, erased 23367671 HMAP: time: 1.135, size: 0, buckets: 33554432, erased 23367671 TMAP: time: 1.064, size: 0, buckets: 33554432, erased 23367671 UMAP: time: 5.632, size: 0, buckets: 31160981, erased 23367671 T4: Iterate random keys: KMAP: time: 0.748, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 CMAP: time: 0.627, size: 23367671, buckets: 33218751, repeats: 8, sum: 4465059465719680 PMAP: time: 0.680, size: 23367671, buckets: 33554431, repeats: 8, sum: 4465059465719680 FMAP: time: 0.735, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 RMAP: time: 0.464, size: 23367671, buckets: 33554431, repeats: 8, sum: 4465059465719680 HMAP: time: 0.719, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 TMAP: time: 0.662, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 UMAP: time: 6.168, size: 23367671, buckets: 31160981, repeats: 8, sum: 4465059465719680 T5: Lookup random keys: KMAP: time: 0.943, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 CMAP: time: 0.863, size: 23367671, buckets: 33218751, lookups: 34235332, found: 29040438 PMAP: time: 1.635, size: 23367671, buckets: 33554431, lookups: 34235332, found: 29040438 FMAP: time: 0.969, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 RMAP: time: 1.705, size: 23367671, buckets: 33554431, lookups: 34235332, found: 29040438 HMAP: time: 0.712, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 TMAP: time: 0.584, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 UMAP: time: 1.974, size: 23367671, buckets: 31160981, lookups: 34235332, found: 29040438
-
Is A* just always slow?
std::unordered_map is notorious for being slow. Use a better implementation (I like the flat naps from here, which are the same as abseil’s). The question that needs to be asked too is if you need to use a map.
-
New Boost.Unordered containers have BIG improvements!
A comparison against phmap would also be nice.
-
How to implement static typing in a C++ bytecode VM?
std::unordered_map is perfectly fine. You can do better with external libraries, like parallel hashmap, but these tend to be drop-in replacements
What are some alternatives?
optick - C++ Profiler For Games
Folly - An open-source C++ library developed and used at Facebook.
orbit - C/C++ Performance Profiler
robin-hood-hashing - Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
palanteer - Visual Python and C++ nanosecond profiler, logger, tests enabler
libcuckoo - A high-performance, concurrent hash table
pprof - pprof is a tool for visualization and analysis of profiling data
rust-phf - Compile time static maps for Rust
STL - MSVC's implementation of the C++ Standard Library.
flat_hash_map - A very fast hashtable
gperftools - Main gperftools repository
FASTER - Fast persistent recoverable log and key-value store + cache, in C# and C++.