C Hashtable

Open-source C projects categorized as Hashtable

Top 10 C Hashtable Projects

  • Klib

    A standalone and lightweight C library

    Project mention: Factor is faster than Zig | news.ycombinator.com | 2023-11-10

    In my example the table stores the hash codes themselves instead of the keys (because the hash function is invertible)

    Oh, I see, right. If determining the home bucket is trivial, then the back-shifting method is great. The issue is just that it’s not as much of a general-purpose solution as it may initially seem.

    “With a different algorithm (Robin Hood or bidirectional linear probing), the load factor can be kept well over 90% with good performance, as the benchmarks in the same repo demonstrate.”

    I’ve seen the 90% claim made several times in literature on Robin Hood hash tables. In my experience, the claim is a bit exaggerated, although I suppose it depends on what our idea of “good performance” is. See these benchmarks, which again go up to a maximum load factor of 0.95 (Although boost and Absl forcibly grow/rehash at 0.85-0.9):

    https://strong-starlight-4ea0ed.netlify.app/

    Tsl, Martinus, and CC are all Robin Hood tables (https://github.com/Tessil/robin-map, https://github.com/martinus/robin-hood-hashing, and https://github.com/JacksonAllan/CC, respectively). Absl and Boost are the well-known SIMD-based hash tables. Khash (https://github.com/attractivechaos/klib/blob/master/khash.h) is, I think, an ordinary open-addressing table using quadratic probing. Fastmap is a new, yet-to-be-published design that is fundamentally similar to bytell (https://www.youtube.com/watch?v=M2fKMP47slQ) but also incorporates some aspects of the aforementioned SIMD maps (it caches a 4-bit fragment of the hash code to avoid most key comparisons).

    As you can see, all the Robin Hood maps spike upwards dramatically as the load factor gets high, becoming as much as 5-6 times slower at 0.95 vs 0.5 in one of the benchmarks (uint64_t key, 256-bit struct value: Total time to erase 1000 existing elements with N elements in map). Only the SIMD maps (with Boost being the better performer) and Fastmap appear mostly immune to load factor in all benchmarks, although the SIMD maps do - I believe - use tombstones for deletion.

    I’ve only read briefly about bi-directional linear probing – never experimented with it.

  • InfluxDB

    Purpose built for real-time analytics at any scale. InfluxDB Platform is powered by columnar analytics, optimized for cost-efficient storage, and built with open data standards.

    InfluxDB logo
  • Collections-C

    A library of generic data structures for the C language.

  • sc

    Common libraries and data structures for C.

  • hashmap.c

    Hash map implementation in C.

  • ctl

    My variant of the C Template Library (by rurban)

  • diskhash

    Diskbased (persistent) hashtable

  • hatrack

    Fast, multi-reader, multi-writer, lockless data structures for parallel programming

  • SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

    SaaSHub logo
  • Harbol

    Harbol is a collection of data structure and miscellaneous libraries, similar in nature to C++'s Boost, STL, and GNOME's GLib but for C99+ (by assyrianic)

  • TableC

    Low-level C89 Closed/Open-addressed hashtable implementation.

  • Generic-C-DataStructures

    A repository for code I wrote while learning to implement generic data structures in C

NOTE: The open source projects on this list are ordered by number of github stars. The number of mentions indicates repo mentiontions in the last 12 Months or since we started tracking (Dec 2020).

C Hashtable discussion

Log in or Post with

C Hashtable related posts

Index

What are some of the best open-source Hashtable projects in C? This list will help you:

Project Stars
1 Klib 4,121
2 Collections-C 2,801
3 sc 2,234
4 hashmap.c 814
5 ctl 167
6 diskhash 156
7 hatrack 81
8 Harbol 26
9 TableC 24
10 Generic-C-DataStructures 1

Sponsored
Purpose built for real-time analytics at any scale.
InfluxDB Platform is powered by columnar analytics, optimized for cost-efficient storage, and built with open data standards.
www.influxdata.com

Did you konow that C is
the 7th most popular programming language
based on number of metions?