SaaSHub helps you find the best software and product alternatives Learn more →
Top 23 Hashtable Open-Source Projects
-
algodeck
An Open-Source Collection of 200+ Flash Cards to Help You Preparing Your Algorithms & Data Structures Interview đź’Ż
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
indexmap
A hash table with consistent order and fast iteration; access items by key or sequence index
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
-
ExcaliburHash
Excalibur Hash is a high-speed hash map and hash set, ideal for performance-critical uses like video games
-
rimbu
Rimbu is a TypeScript library focused on immutable, performant, and type-safe collections and other tools.
-
seq
The seq library is a collection of original C++14 STL-like containers and related tools (by Thermadiag)
-
fph-table
Flash Perfect Hash Table: an implementation of a dynamic perfect hash table, extremely fast for lookup
-
Harbol
Harbol is a collection of data structure and miscellaneous libraries, similar in nature to C++'s Boost, STL, and GNOME's GLib but for C99+ (by assyrianic)
-
redis-pydict
A python dictionary that uses Redis as in-memory storage backend to facilitate distributed computing applications development.
-
hashtable-bench
A benchmark for hash tables and hash functions in C++, evaluate on different data as comprehensively as possible
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
In my example the table stores the hash codes themselves instead of the keys (because the hash function is invertible)
Oh, I see, right. If determining the home bucket is trivial, then the back-shifting method is great. The issue is just that it’s not as much of a general-purpose solution as it may initially seem.
“With a different algorithm (Robin Hood or bidirectional linear probing), the load factor can be kept well over 90% with good performance, as the benchmarks in the same repo demonstrate.”
I’ve seen the 90% claim made several times in literature on Robin Hood hash tables. In my experience, the claim is a bit exaggerated, although I suppose it depends on what our idea of “good performance” is. See these benchmarks, which again go up to a maximum load factor of 0.95 (Although boost and Absl forcibly grow/rehash at 0.85-0.9):
https://strong-starlight-4ea0ed.netlify.app/
Tsl, Martinus, and CC are all Robin Hood tables (https://github.com/Tessil/robin-map, https://github.com/martinus/robin-hood-hashing, and https://github.com/JacksonAllan/CC, respectively). Absl and Boost are the well-known SIMD-based hash tables. Khash (https://github.com/attractivechaos/klib/blob/master/khash.h) is, I think, an ordinary open-addressing table using quadratic probing. Fastmap is a new, yet-to-be-published design that is fundamentally similar to bytell (https://www.youtube.com/watch?v=M2fKMP47slQ) but also incorporates some aspects of the aforementioned SIMD maps (it caches a 4-bit fragment of the hash code to avoid most key comparisons).
As you can see, all the Robin Hood maps spike upwards dramatically as the load factor gets high, becoming as much as 5-6 times slower at 0.95 vs 0.5 in one of the benchmarks (uint64_t key, 256-bit struct value: Total time to erase 1000 existing elements with N elements in map). Only the SIMD maps (with Boost being the better performer) and Fastmap appear mostly immune to load factor in all benchmarks, although the SIMD maps do - I believe - use tombstones for deletion.
I’ve only read briefly about bi-directional linear probing – never experimented with it.
Project mention: I just published my first crate: `identified_vec` - I would love some input! PR's are most welcome. | /r/learnrust | 2023-12-09You might want to check out how popular ecosystem crates do some of these things. Particularly relevant to you are probably crates providing collections, such as smallvec, hashbrown, or indexmap.
Time to deal with the large elephant in the room, the runtime.mapaccess2_fast64 map lookup. Despite spending some hours of research, I couldn't found any viable way to optimize the builtin map. However, there is a community alternative called Swiss Map, which sells itself as faster and more memory efficient than the builtin one. Replacing it is almost a drop-in, with just some syntax changes:
https://github.com/topics/datalog?l=rust ... Cozo, Crepe
Crepe: https://github.com/ekzhang/crepe :
> Crepe is a library that allows you to write declarative logic programs in Rust, with a Datalog-like syntax. It provides a procedural macro that generates efficient, safe code and interoperates seamlessly with Rust programs.
Looks like there's not yet a Python grammar for the treeedb tree-sitter: https://github.com/langston-barrett/treeedb :
> Generate Soufflé Datalog types, relations, and facts that represent ASTs from a variety of programming languages.
Looks like roxi supports n3, which adds `=>` "implies" to the Turtle lightweight RDF representation: https://github.com/pbonte/roxi
FWIW rdflib/owl-rl: https://owl-rl.readthedocs.io/en/latest/owlrl.html :
> simple forward chaining rules are used to extend (recursively) the incoming graph with all triples that the rule sets permit (ie, the “deductive closure” of the graph is computed).
ForwardChainingStore and BackwardChainingStore implementations w/ rdflib in Python: https://github.com/RDFLib/FuXi/issues/15
Fast CUDA hashmaps
Gdlog is built on CuCollections.
GPU HashMap libs to benchmark: Warpcore, CuCollections,
https://github.com/NVIDIA/cuCollections
https://github.com/NVIDIA/cccl
https://github.com/sleeepyjack/warpcore
/? Rocm HashMap
DeMoriarty/DOKsparse:
Project mention: Excalibur Hash is a high-speed hash map and hash set | news.ycombinator.com | 2024-01-10
Project mention: A header-only C implementation of C++ <algorithm> | news.ycombinator.com | 2023-07-03Well, I do like mine better, which is closer to the STL, and for all containers. https://github.com/rurban/ctl/
I have developped a library of original c++11 STL like containers. It took me a full year of homework... The adresse is https://github.com/Thermadiag/seq
Hashtable related posts
-
StupidAlloc: what if memory allocation was bad actually
-
A simple hash table in C
-
dashmap VS scalable-concurrent-containers - a user suggested alternative
2 projects | 13 Apr 2023 -
Ultra-light hashtable
-
Light-weight closed-addressing hashtable
-
C_dictionary: A simple dynamically typed and sized hashmap in C - feedback welcome
-
Samsara, a safe Rust concurrent cycle collector
-
A note from our sponsor - SaaSHub
www.saashub.com | 1 May 2024
Index
What are some of the best open-source Hashtable projects? This list will help you:
Project | Stars | |
---|---|---|
1 | algodeck | 5,385 |
2 | Klib | 4,021 |
3 | dashmap | 2,726 |
4 | Collections-C | 2,715 |
5 | sc | 2,165 |
6 | indexmap | 1,556 |
7 | swiss | 669 |
8 | cuCollections | 417 |
9 | data-structures | 415 |
10 | megahash | 401 |
11 | ExcaliburHash | 306 |
12 | HashMap | 190 |
13 | ctl | 162 |
14 | diskhash | 154 |
15 | rimbu | 124 |
16 | seq | 80 |
17 | hatrack | 79 |
18 | 42nd-at-threadmill | 56 |
19 | fph-table | 37 |
20 | Harbol | 26 |
21 | TableC | 23 |
22 | redis-pydict | 19 |
23 | hashtable-bench | 12 |
Sponsored