Experiences with Concurrent Hash Map Libraries

This page summarizes the projects mentioned and recommended in the original post on /r/cpp

  • parallel-hashmap

    A family of header-only, very fast and memory-friendly hashmap and btree containers.

  • I'm the author of parallel-hashmap. There are ways to do what you suggest either lock-free, or with minimal locking. If you have a test program for your use case I'd be happy to adapt it for using phmap.
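To make the "minimal locking" idea concrete, here is a minimal sketch of lock striping: the key space is split across several independent submaps, each with its own mutex, so threads touching different shards do not contend. This is an illustration only. `ShardedMap`, `upsert`, and `count_concurrently` are hypothetical names, the sketch is built on `std::unordered_map`, and it is not parallel-hashmap's API or implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <optional>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical illustration type; NOT part of parallel-hashmap.
// Each shard is an ordinary map guarded by its own mutex.
template <class K, class V, std::size_t NumShards = 16>
class ShardedMap {
    static_assert(NumShards > 0, "need at least one shard");

    struct Shard {
        std::mutex mtx;
        std::unordered_map<K, V> map;
    };
    std::array<Shard, NumShards> shards_;

    // Pick the shard from the key's hash; only that shard gets locked.
    Shard& shard_for(const K& key) {
        return shards_[std::hash<K>{}(key) % NumShards];
    }

public:
    // Insert `init` if the key is absent, otherwise apply `fn` to the
    // existing value. Both happen under the shard lock, so the
    // read-modify-write is atomic with respect to other callers.
    template <class F>
    void upsert(const K& key, F&& fn, V init) {
        Shard& s = shard_for(key);
        std::lock_guard<std::mutex> lock(s.mtx);
        auto [it, inserted] = s.map.try_emplace(key, std::move(init));
        if (!inserted) fn(it->second);
    }

    // Return a copy of the value so no reference escapes the lock.
    std::optional<V> find(const K& key) {
        Shard& s = shard_for(key);
        std::lock_guard<std::mutex> lock(s.mtx);
        auto it = s.map.find(key);
        if (it == s.map.end()) return std::nullopt;
        return it->second;
    }
};

// Demo: several threads increment counters for the same 8 keys.
// Returns the sum of all counters, i.e. the number of upsert calls.
inline long long count_concurrently(int num_threads, int iters) {
    assert(num_threads > 0 && iters > 0);
    ShardedMap<int, long long> m;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&m, iters] {
            for (int i = 0; i < iters; ++i)
                m.upsert(i % 8, [](long long& v) { ++v; }, 1);
        });
    }
    for (auto& w : workers) w.join();

    long long total = 0;
    for (int k = 0; k < 8; ++k) total += m.find(k).value_or(0);
    return total;
}
```

Because every update runs under exactly one shard lock, the counters never lose an increment, and `find` returns a copy so that no reference outlives the lock. A real library can be considerably more refined than this sketch.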

  • libcuckoo

    A high-performance, concurrent hash table

  • libcuckoo replaced junction as my concurrent hash map. It allowed me to get rid of my pointer longevity management, and I saw no decrease in performance. The repository has had no commits in over a year. I think the author of parallel-hashmap made a good point here when he pointed out that it's only worth trying the more experimental hash maps like junction or growt when hash map access is actually the bottleneck. In my case the performance of libcuckoo was not a bottleneck, so I saw no difference in performance compared to junction.

  • Folly

    An open-source C++ library developed and used at Facebook.

  • folly's AtomicHashMap requires knowing the approximate number of elements up-front and the space for erased elements can never be reclaimed. This doesn't work well for our application.
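The two limitations in this comment tend to go together in fixed-capacity, open-addressing tables. Below is a simplified, single-threaded model of why: the capacity is chosen at construction, and erasing only marks a slot with a tombstone that later inserts do not reuse. `FixedTable` and its members are hypothetical names for illustration. This is not folly's implementation or API, and it leaves out all of the atomics a concurrent table would need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical single-threaded model; NOT folly's AtomicHashMap.
class FixedTable {
    static constexpr std::uint64_t kEmpty  = UINT64_MAX;      // never-used slot
    static constexpr std::uint64_t kErased = UINT64_MAX - 1;  // tombstone

    struct Slot {
        std::uint64_t key = kEmpty;
        int value = 0;
    };
    std::vector<Slot> slots_;  // sized once, never grown

    std::size_t index(std::uint64_t key, std::size_t probe) const {
        const std::size_t n = slots_.size();
        return (static_cast<std::size_t>(key % n) + probe) % n;
    }

public:
    explicit FixedTable(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Linear probing. Returns false when no never-used slot is left on
    // the probe path; tombstones are skipped, not reused.
    bool insert(std::uint64_t key, int value) {
        assert(key < kErased);  // the two largest values are reserved
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            Slot& s = slots_[index(key, i)];
            if (s.key == key) { s.value = value; return true; }
            if (s.key == kEmpty) { s.key = key; s.value = value; return true; }
        }
        return false;
    }

    std::optional<int> find(std::uint64_t key) const {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            const Slot& s = slots_[index(key, i)];
            if (s.key == key) return s.value;
            if (s.key == kEmpty) return std::nullopt;  // end of probe chain
        }
        return std::nullopt;
    }

    // Erase leaves a tombstone so probe chains through this slot stay intact.
    bool erase(std::uint64_t key) {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            Slot& s = slots_[index(key, i)];
            if (s.key == key) { s.key = kErased; return true; }
            if (s.key == kEmpty) return false;
        }
        return false;
    }

    // Number of slots that can still accept a new key.
    std::size_t free_slots() const {
        std::size_t n = 0;
        for (const Slot& s : slots_) n += (s.key == kEmpty);
        return n;
    }
};
```

In this model, a table built with capacity 4 that has held four keys reports zero free slots even after one of them is erased, and the next insert of a new key fails. A workload with steady insert/erase churn therefore exhausts the table no matter how few keys are live at once.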

  • growt

    This is a header-only library offering a variety of dynamically growing concurrent hash tables that all work by dynamically migrating the current table once it gets too full.

  • growt shows impressive benchmark results in this paper compared to folly, TBB, junction, and libcuckoo. However, it was not in good shape to be used as a production dependency. I had several issues and compilation errors here, here, and here.

  • junction

    Concurrent data structures in C++

  • junction has a very impressive performance benchmark here. Initially it worked for my application, but I ran into some issues:

    • Only raw pointers are supported as either keys or values. This means I am responsible for memory management and it was a pain.
    • junction's required dependency "turf" causes linker errors when compiling with -fsanitize=address because there are symbol name collisions.
    • Every thread that accesses the hash map must periodically call an update function or memory will be leaked.
    • No commits in over three years, GitHub issues aren't getting any attention.
    • The author said it's experimental and he doesn't want it to become more popular.

  • FASTER

    Fast persistent recoverable log and key-value store + cache, in C# and C++.

  • You could use FasterKV: https://github.com/microsoft/FASTER

