Hashtable

Top 23 Hashtable Open-Source Projects

  • algodeck

    An Open-Source Collection of 200+ Flash Cards to Help You Preparing Your Algorithms & Data Structures Interview đź’Ż

  • Klib

    A standalone and lightweight C library

  • Project mention: Factor is faster than Zig | news.ycombinator.com | 2023-11-10

    In my example the table stores the hash codes themselves instead of the keys (because the hash function is invertible)

    Oh, I see, right. If determining the home bucket is trivial, then the back-shifting method is great. The issue is just that it’s not as much of a general-purpose solution as it may initially seem.

    “With a different algorithm (Robin Hood or bidirectional linear probing), the load factor can be kept well over 90% with good performance, as the benchmarks in the same repo demonstrate.”

    I’ve seen the 90% claim made several times in literature on Robin Hood hash tables. In my experience, the claim is a bit exaggerated, although I suppose it depends on what our idea of “good performance” is. See these benchmarks, which again go up to a maximum load factor of 0.95 (Although boost and Absl forcibly grow/rehash at 0.85-0.9):

    https://strong-starlight-4ea0ed.netlify.app/

    Tsl, Martinus, and CC are all Robin Hood tables (https://github.com/Tessil/robin-map, https://github.com/martinus/robin-hood-hashing, and https://github.com/JacksonAllan/CC, respectively). Absl and Boost are the well-known SIMD-based hash tables. Khash (https://github.com/attractivechaos/klib/blob/master/khash.h) is, I think, an ordinary open-addressing table using quadratic probing. Fastmap is a new, yet-to-be-published design that is fundamentally similar to bytell (https://www.youtube.com/watch?v=M2fKMP47slQ) but also incorporates some aspects of the aforementioned SIMD maps (it caches a 4-bit fragment of the hash code to avoid most key comparisons).

    As you can see, all the Robin Hood maps spike upwards dramatically as the load factor gets high, becoming as much as 5-6 times slower at 0.95 vs 0.5 in one of the benchmarks (uint64_t key, 256-bit struct value: Total time to erase 1000 existing elements with N elements in map). Only the SIMD maps (with Boost being the better performer) and Fastmap appear mostly immune to load factor in all benchmarks, although the SIMD maps do - I believe - use tombstones for deletion.

    I’ve only read briefly about bi-directional linear probing – never experimented with it.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • dashmap

    Blazing fast concurrent HashMap for Rust.

  • Project mention: StupidAlloc: what if memory allocation was bad actually | /r/rust | 2023-07-07
  • Collections-C

    A library of generic data structures for the C language.

  • sc

    Common libraries and data structures for C.

  • Project mention: A simple hash table in C | news.ycombinator.com | 2023-06-13
  • indexmap

    A hash table with consistent order and fast iteration; access items by key or sequence index

  • Project mention: I just published my first crate: `identified_vec` - I would love some input! PR's are most welcome. | /r/learnrust | 2023-12-09

    You might want to check out how popular ecosystem crates do some of these things. Particularly relevant to you are probably crates providing collections, such as smallvec, hashbrown, or indexmap.

  • swiss

    Golang port of Abseil's SwissTable

  • Project mention: One Billion Row Challenge in Golang - From 95s to 1.96s | dev.to | 2024-03-17

    Time to deal with the large elephant in the room, the runtime.mapaccess2_fast64 map lookup. Despite spending some hours of research, I couldn't found any viable way to optimize the builtin map. However, there is a community alternative called Swiss Map, which sells itself as faster and more memory efficient than the builtin one. Replacing it is almost a drop-in, with just some syntax changes:

  • SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

    SaaSHub logo
  • cuCollections

  • Project mention: GDlog: A GPU-Accelerated Deductive Engine | news.ycombinator.com | 2023-12-03

    https://github.com/topics/datalog?l=rust ... Cozo, Crepe

    Crepe: https://github.com/ekzhang/crepe :

    > Crepe is a library that allows you to write declarative logic programs in Rust, with a Datalog-like syntax. It provides a procedural macro that generates efficient, safe code and interoperates seamlessly with Rust programs.

    Looks like there's not yet a Python grammar for the treeedb tree-sitter: https://github.com/langston-barrett/treeedb :

    > Generate Soufflé Datalog types, relations, and facts that represent ASTs from a variety of programming languages.

    Looks like roxi supports n3, which adds `=>` "implies" to the Turtle lightweight RDF representation: https://github.com/pbonte/roxi

    FWIW rdflib/owl-rl: https://owl-rl.readthedocs.io/en/latest/owlrl.html :

    > simple forward chaining rules are used to extend (recursively) the incoming graph with all triples that the rule sets permit (ie, the “deductive closure” of the graph is computed).

    ForwardChainingStore and BackwardChainingStore implementations w/ rdflib in Python: https://github.com/RDFLib/FuXi/issues/15

    Fast CUDA hashmaps

    Gdlog is built on CuCollections.

    GPU HashMap libs to benchmark: Warpcore, CuCollections,

    https://github.com/NVIDIA/cuCollections

    https://github.com/NVIDIA/cccl

    https://github.com/sleeepyjack/warpcore

    /? Rocm HashMap

    DeMoriarty/DOKsparse:

  • data-structures

    Go datastructures. (by timtadh)

  • megahash

    A super-fast C++ hash table with Node.js wrapper, tested up to 1 billion keys.

  • ExcaliburHash

    Excalibur Hash is a high-speed hash map and hash set, ideal for performance-critical uses like video games

  • Project mention: Excalibur Hash is a high-speed hash map and hash set | news.ycombinator.com | 2024-01-10
  • HashMap

    An open addressing linear probing hash table, tuned for delete heavy workloads (by rigtorp)

  • ctl

    My variant of the C Template Library (by rurban)

  • Project mention: A header-only C implementation of C++ <algorithm> | news.ycombinator.com | 2023-07-03

    Well, I do like mine better, which is closer to the STL, and for all containers. https://github.com/rurban/ctl/

  • diskhash

    Diskbased (persistent) hashtable

  • rimbu

    Rimbu is a TypeScript library focused on immutable, performant, and type-safe collections and other tools.

  • seq

    The seq library is a collection of original C++14 STL-like containers and related tools (by Thermadiag)

  • Project mention: C++ Show and Tell - June 2023 | /r/cpp | 2023-06-01

    I have developped a library of original c++11 STL like containers. It took me a full year of homework... The adresse is https://github.com/Thermadiag/seq

  • hatrack

    Fast, multi-reader, multi-writer, lockless data structures for parallel programming

  • 42nd-at-threadmill

    A SIMD-accelerated concurrent hash table.

  • fph-table

    Flash Perfect Hash Table: an implementation of a dynamic perfect hash table, extremely fast for lookup

  • Harbol

    Harbol is a collection of data structure and miscellaneous libraries, similar in nature to C++'s Boost, STL, and GNOME's GLib but for C99+ (by assyrianic)

  • TableC

    Low-level C89 Closed/Open-addressed hashtable implementation.

  • redis-pydict

    A python dictionary that uses Redis as in-memory storage backend to facilitate distributed computing applications development.

  • hashtable-bench

    A benchmark for hash tables and hash functions in C++, evaluate on different data as comprehensively as possible

  • SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

    SaaSHub logo
NOTE: The open source projects on this list are ordered by number of github stars. The number of mentions indicates repo mentiontions in the last 12 Months or since we started tracking (Dec 2020).

Hashtable related posts

  • StupidAlloc: what if memory allocation was bad actually

    3 projects | /r/rust | 7 Jul 2023
  • A simple hash table in C

    7 projects | news.ycombinator.com | 13 Jun 2023
  • dashmap VS scalable-concurrent-containers - a user suggested alternative

    2 projects | 13 Apr 2023
  • Ultra-light hashtable

    1 project | /r/C_Programming | 2 Feb 2023
  • Light-weight closed-addressing hashtable

    1 project | /r/C_Programming | 2 Feb 2023
  • C_dictionary: A simple dynamically typed and sized hashmap in C - feedback welcome

    10 projects | /r/C_Programming | 23 Jan 2023
  • Samsara, a safe Rust concurrent cycle collector

    4 projects | /r/rust | 4 Jan 2023
  • A note from our sponsor - SaaSHub
    www.saashub.com | 1 May 2024
    SaaSHub helps you find the best software and product alternatives Learn more →

Index

What are some of the best open-source Hashtable projects? This list will help you:

Project Stars
1 algodeck 5,385
2 Klib 4,021
3 dashmap 2,726
4 Collections-C 2,715
5 sc 2,165
6 indexmap 1,556
7 swiss 669
8 cuCollections 417
9 data-structures 415
10 megahash 401
11 ExcaliburHash 306
12 HashMap 190
13 ctl 162
14 diskhash 154
15 rimbu 124
16 seq 80
17 hatrack 79
18 42nd-at-threadmill 56
19 fph-table 37
20 Harbol 26
21 TableC 23
22 redis-pydict 19
23 hashtable-bench 12

Sponsored
SaaSHub - Software Alternatives and Reviews
SaaSHub helps you find the best software and product alternatives
www.saashub.com