robin-hood-hashing vs flat_hash_map

robin-hood-hashing

Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20 (by martinus)

DISCONTINUED

Suggest alternative

Edit details

flat_hash_map

A very fast hashtable (by skarupke)

Suggest topics

Source Code

Suggest alternative

Edit details

Our great sponsors

InfluxDB - Power Real-Time Data Analytics at Scale

WorkOS - The modern identity platform for B2B SaaS

SaaSHub - Software Alternatives and Reviews

Our great sponsors

robin-hood-hashing		flat_hash_map
	Project
23	Mentions	10
1,465	Stars	1,677
-	Growth	-
0.0	Activity	0.0
12 months ago	Latest Commit	7 months ago
C++	Language	C++
MIT License	License	-

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

robin-hood-hashing

Posts with mentions or reviews of robin-hood-hashing. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-11-10.

Factor is faster than Zig
11 projects | news.ycombinator.com | 10 Nov 2023

In my example the table stores the hash codes themselves instead of the keys (because the hash function is invertible)
Oh, I see, right. If determining the home bucket is trivial, then the back-shifting method is great. The issue is just that it’s not as much of a general-purpose solution as it may initially seem.
“With a different algorithm (Robin Hood or bidirectional linear probing), the load factor can be kept well over 90% with good performance, as the benchmarks in the same repo demonstrate.”
I’ve seen the 90% claim made several times in literature on Robin Hood hash tables. In my experience, the claim is a bit exaggerated, although I suppose it depends on what our idea of “good performance” is. See these benchmarks, which again go up to a maximum load factor of 0.95 (Although boost and Absl forcibly grow/rehash at 0.85-0.9):
https://strong-starlight-4ea0ed.netlify.app/
Tsl, Martinus, and CC are all Robin Hood tables (https://github.com/Tessil/robin-map, https://github.com/martinus/robin-hood-hashing, and https://github.com/JacksonAllan/CC, respectively). Absl and Boost are the well-known SIMD-based hash tables. Khash (https://github.com/attractivechaos/klib/blob/master/khash.h) is, I think, an ordinary open-addressing table using quadratic probing. Fastmap is a new, yet-to-be-published design that is fundamentally similar to bytell (https://www.youtube.com/watch?v=M2fKMP47slQ) but also incorporates some aspects of the aforementioned SIMD maps (it caches a 4-bit fragment of the hash code to avoid most key comparisons).
As you can see, all the Robin Hood maps spike upwards dramatically as the load factor gets high, becoming as much as 5-6 times slower at 0.95 vs 0.5 in one of the benchmarks (uint64_t key, 256-bit struct value: Total time to erase 1000 existing elements with N elements in map). Only the SIMD maps (with Boost being the better performer) and Fastmap appear mostly immune to load factor in all benchmarks, although the SIMD maps do - I believe - use tombstones for deletion.
I’ve only read briefly about bi-directional linear probing – never experimented with it.
If this isn't the perfect data structure, why?
3 projects | /r/C_Programming | 22 Oct 2023

From your other comments, it seems like your knowledge of hash tables might be limited to closed-addressing/separate-chaining hash tables. The current frontrunners in high-performance, memory-efficient hash table design all use some form of open addressing, largely to avoid pointer chasing and limit cache misses. In this regard, you want to check our SSE-powered hash tables (such as Abseil, Boost, and Folly/F14), Robin Hood hash tables (such as Martinus and Tessil), or Skarupke (I've recently had a lot of success with a similar design that I will publish here soon and is destined to replace my own Robin Hood hash tables). Also check out existing research/benchmarks here and here. But we a little bit wary of any benchmarks you look at or perform because there are a lot of factors that influence the result (e.g. benchmarking hash tables at a maximum load factor of 0.5 will produce wildly different result to benchmarking them at a load factor of 0.95, just as benchmarking them with integer keys-value pairs will produce different results to benchmarking them with 256-byte key-value pairs). And you need to familiarize yourself with open addressing and different probing strategies (e.g. linear, quadratic) first.
boost::unordered standalone
3 projects | /r/cpp | 9 Jul 2023

Also, FYI there is robin_hood::unordered_{map,set} which has very high performance, and is header-only and standalone.
Solving “Two Sum” in C with a tiny hash table
1 project | news.ycombinator.com | 29 Jun 2023

std::unordered_map is notoriously slow, several times slower than a "proper" hashmap implementation like Google's absl or Martin's robin-hood-hashing [1]. That said, std::sort is not the fastest sort implementation, either. It is hard to say which will win.
[1]: https://github.com/martinus/robin-hood-hashing
Convenient Containers v1.0.3: Better compile speed, faster maps and sets
4 projects | /r/C_Programming | 3 May 2023

The main advantage of the latest version is that it reduces build time by about 53% (GCC 12.1), based on the comprehensive test suit found in unit_tests.c. This improvement is significant because compile time was previously a drawback of this library, with maps and sets—in particular—compiling slower than their C++ template-based counterparts. I achieved it by refactoring the library to do less work inside API macros and, in particular, use fewer _Generic statements, which seem to be a compile-speed bottleneck. A nice side effect of the refactor is that the library can now more easily be extended with the planned dynamic strings and ordered maps and sets. The other major improvement concerns the performance of maps and sets. Here are some interactive benchmarks[1] comparing CC’s maps to two popular implementations of Robin Hood hash maps in C++ (as well as std::unordered_map as a baseline). They show that CC maps perform roughly on par with those implementations.
Effortless Performance Improvements in C++: std:unordered_map
4 projects | news.ycombinator.com | 2 Mar 2023

For anyone in a situation where a set/map (or unordered versions) is in a hot part of the code, I'd also highly recommend Robin Hood: https://github.com/martinus/robin-hood-hashing
It made a huge difference in one of the programs I was running.
Inside boost::unordered_flat_map
11 projects | /r/cpp | 18 Nov 2022
What are some cool modern libraries you enjoy using?
32 projects | /r/cpp | 18 Sep 2022

Oh my bad. Still thought -- your name.. it looks very familiar to me. Are you the robin_hood hashing guy perhaps? Yes you are! My bad -- https://github.com/martinus/robin-hood-hashing.
Performance comparison: counting words in Python, C/C++, Awk, Rust, and more
12 projects | news.ycombinator.com | 24 Jul 2022
Got a bit better C++ version here which uses a couple libraries instead of std:: stuff - https://gist.github.com/jcelerier/74dfd473bccec8f1bd5d78be5a... ; boost, fmt and https://github.com/martinus/robin-hood-hashing
```
    $ g++ -I robin-hood-hashing/src/include -O2 -flto -std=c++20 -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -lfmt
```
A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion
5 projects | /r/cpp | 4 Jul 2022

The implementation is mostly inspired by this comment and lessons learned from my older robin-hood-hashing hashmap.

flat_hash_map

Posts with mentions or reviews of flat_hash_map. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-02.

Effortless Performance Improvements in C++: std:unordered_map
4 projects | news.ycombinator.com | 2 Mar 2023

If you don't need all the guarantees provided by std::unordered_map (pointer stability is usually the big one), you can go a /lot/ faster with a map that uses open addressing.
Some of my favorite alternative hash map implementations are ska::flat_hash_map and ska::bytell_hash_map from https://github.com/skarupke/flat_hash_map. They're fast, and the single header implementation makes them easy to add to a project. For my use cases they generally offer similar performance to abseil and folly F14.
Don't be fooled by the fact that they haven't been updated in ~5 years. I've been using them for nearly that long and have yet to find any bugs.
Inside boost::unordered_flat_map
11 projects | /r/cpp | 18 Nov 2022
A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion
5 projects | /r/cpp | 4 Jul 2022

When int64 is the key, then the winner remains the unorder map from Malte Skarupke if (and only if) associated with a custom allocator.
boost::unordered map is a new king of data structures
10 projects | /r/cpp | 30 Jun 2022

Unordered hash map shootout CMAP = https://github.com/tylov/STC KMAP = https://github.com/attractivechaos/klib PMAP = https://github.com/greg7mdp/parallel-hashmap FMAP = https://github.com/skarupke/flat_hash_map RMAP = https://github.com/martinus/robin-hood-hashing HMAP = https://github.com/Tessil/hopscotch-map TMAP = https://github.com/Tessil/robin-map UMAP = std::unordered_map Usage: shootout [n-million=40 key-bits=25] Random keys are in range [0, 2^25). Seed = 1656617916: T1: Insert/update random keys: KMAP: time: 1.949, size: 15064129, buckets: 33554432, sum: 165525449561381 CMAP: time: 1.649, size: 15064129, buckets: 22145833, sum: 165525449561381 PMAP: time: 2.434, size: 15064129, buckets: 33554431, sum: 165525449561381 FMAP: time: 2.112, size: 15064129, buckets: 33554432, sum: 165525449561381 RMAP: time: 1.708, size: 15064129, buckets: 33554431, sum: 165525449561381 HMAP: time: 2.054, size: 15064129, buckets: 33554432, sum: 165525449561381 TMAP: time: 1.645, size: 15064129, buckets: 33554432, sum: 165525449561381 UMAP: time: 6.313, size: 15064129, buckets: 31160981, sum: 165525449561381 T2: Insert sequential keys, then remove them in same order: KMAP: time: 1.173, size: 0, buckets: 33554432, erased 20000000 CMAP: time: 1.651, size: 0, buckets: 33218751, erased 20000000 PMAP: time: 3.840, size: 0, buckets: 33554431, erased 20000000 FMAP: time: 1.722, size: 0, buckets: 33554432, erased 20000000 RMAP: time: 2.359, size: 0, buckets: 33554431, erased 20000000 HMAP: time: 0.849, size: 0, buckets: 33554432, erased 20000000 TMAP: time: 0.660, size: 0, buckets: 33554432, erased 20000000 UMAP: time: 2.138, size: 0, buckets: 31160981, erased 20000000 T3: Remove random keys: KMAP: time: 1.973, size: 0, buckets: 33554432, erased 23367671 CMAP: time: 2.020, size: 0, buckets: 33218751, erased 23367671 PMAP: time: 2.940, size: 0, buckets: 33554431, erased 23367671 FMAP: time: 1.147, size: 0, buckets: 33554432, erased 23367671 RMAP: time: 1.941, size: 0, buckets: 33554431, erased 23367671 HMAP: time: 1.135, size: 0, buckets: 33554432, erased 23367671 TMAP: time: 1.064, size: 0, buckets: 33554432, erased 23367671 UMAP: time: 5.632, size: 0, buckets: 31160981, erased 23367671 T4: Iterate random keys: KMAP: time: 0.748, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 CMAP: time: 0.627, size: 23367671, buckets: 33218751, repeats: 8, sum: 4465059465719680 PMAP: time: 0.680, size: 23367671, buckets: 33554431, repeats: 8, sum: 4465059465719680 FMAP: time: 0.735, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 RMAP: time: 0.464, size: 23367671, buckets: 33554431, repeats: 8, sum: 4465059465719680 HMAP: time: 0.719, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 TMAP: time: 0.662, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 UMAP: time: 6.168, size: 23367671, buckets: 31160981, repeats: 8, sum: 4465059465719680 T5: Lookup random keys: KMAP: time: 0.943, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 CMAP: time: 0.863, size: 23367671, buckets: 33218751, lookups: 34235332, found: 29040438 PMAP: time: 1.635, size: 23367671, buckets: 33554431, lookups: 34235332, found: 29040438 FMAP: time: 0.969, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 RMAP: time: 1.705, size: 23367671, buckets: 33554431, lookups: 34235332, found: 29040438 HMAP: time: 0.712, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 TMAP: time: 0.584, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 UMAP: time: 1.974, size: 23367671, buckets: 31160981, lookups: 34235332, found: 29040438
Updating map_benchmarks: Send your hashmaps!
13 projects | /r/cpp | 16 Jun 2022

I believe that when the number of elements is larger than 4 (a rough estimation), the associative linear table won't be faster than ska::flat_hash_map or fph-table with the identity hash function. If you look at the benchmark results, you will find that the average lookup time may well be less than 2 nanoseconds when item number is smaller than one thousand on morden CPUs. For these two hash tables, there are only about ten instructions in the critical path of lookup. And this should be faster than the linear search in a associative table, where there are a lot of branches and comparing instructions. However, you should benchmark it youself to get the real conclusion. This is just a simple analysis on paper from mine. By the way, the associative table can be faster if it is implemented with hardware circuits or SIMD instructions.
Will std::set and std::unordered_set implement extra optimizations for low-number of expected unique values with too many duplicates?
2 projects | /r/cpp | 27 Mar 2022

I have done similar benchmarks (with a lot of care) a few months ago, but the results were different than yours when using a very fast hash map. I was surprised that even for small maps, flat_hash_map was faster than searching small arrays (searching int64_t).
A truly fast Map implementation?
3 projects | /r/cpp | 24 Aug 2021

You should look for flat-hash-maps. This is a good implementation, skarupke/flat_hash_map. The author also has a talk about the implementation at one of the boost conferences on youtube.
Dolphin Emulator - Dolphin MEGA Progress Report: April and May 2021
1 project | /r/programming | 6 Jun 2021

You may want to give a try to Skarupke's HashMaps.
Fast insert-only hash map
3 projects | /r/cpp | 9 Mar 2021

You can either use Abseil's Swiss Table, or Facebook's F14, or Skarupke's flat_hash_map.
C++: How a simple question helped me form a New Year's Resolution
1 project | /r/cpp | 4 Jan 2021

The state of the art for hash-based containers would be either Abseil's or Skarupke's.

What are some alternatives?

When comparing robin-hood-hashing and flat_hash_map you can also consider the following projects:

parallel-hashmap - A family of header-only, very fast and memory-friendly hashmap and btree containers.

STL - MSVC's implementation of the C++ Standard Library.

unordered - Boost.org unordered module

robin-map - C++ implementation of a fast hash map and hash set using robin hood hashing

dense_hash_map - A simple replacement for std::unordered_map

xxHash - Extremely fast non-cryptographic hash algorithm

C++ Format - A modern formatting library

unordered_dense - A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion

tracy - Frame profiler

sparsepp - A fast, memory efficient hash map for C++

robin-hood-hashing vs parallel-hashmap flat_hash_map vs parallel-hashmap robin-hood-hashing vs STL flat_hash_map vs unordered robin-hood-hashing vs robin-map flat_hash_map vs dense_hash_map robin-hood-hashing vs xxHash flat_hash_map vs robin-map robin-hood-hashing vs C++ Format flat_hash_map vs unordered_dense robin-hood-hashing vs tracy flat_hash_map vs sparsepp

Compare robin-hood-hashing vs flat_hash_map and see what are their differences.

robin-hood-hashing

flat_hash_map

robin-hood-hashing

flat_hash_map

What are some alternatives?