C datastructure

Open-source C projects categorized as datastructure

C datastructure Projects

  • hamt

    A hash array-mapped trie implementation in C

  • Project mention: Visual Introduction to Hash-Array Mapped Tries (HAMTs) | news.ycombinator.com | 2023-08-24

    This isn't a very good explanation. The Wikipedia article isn't great either. I like this description:

    https://github.com/mkirchner/hamt#persistent-hash-array-mapp...

    The name does tell you quite a bit about what these are:

    * Hash - rather than directly using the keys to navigate the structure, the keys are hashed, and the hashes are used for navigation. This turns potentially long, poorly-distributed keys into short, well-distributed keys. However, that does mean you have to compute a hash on every access, and have to deal with hash collisions. The mkirchner implementation above calls collisions "hash exhaustion", and deals with them using some generational hashing scheme. I think I'd fall back to collision lists until that was conclusively proven to be too slow.

    * Trie - the tree is navigated by indexing nodes using chunks of the (hash of the) key, rather than comparing the keys in the node

    * Array mapped - sparse nodes are compressed, using a bitmap to indicate which logical slots are occupied, and then only storing those. The bitmaps live in the parent node, rather than the node itself, I think? Presumably helps with fetching.
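    The three ideas above can be sketched in a few lines of C. This is a minimal illustration, not the mkirchner API: it assumes a 32-bit hash, five-bit chunks, and a 32-bit occupancy bitmap, and every name in it (hamt_chunk, hamt_has, hamt_pos, popcount32) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HAMT_BITS 5u                       /* hash bits consumed per trie level (assumed) */
#define HAMT_MASK ((1u << HAMT_BITS) - 1u) /* 0x1f: selects one 5-bit chunk */

/* Portable population count (Kernighan's method): clears one set bit per step. */
static unsigned popcount32(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1u;
        n++;
    }
    return n;
}

/* "Trie": the logical slot (0..31) is a chunk of the hash, chosen by depth. */
static unsigned hamt_chunk(uint32_t hash, unsigned depth)
{
    assert(depth * HAMT_BITS < 32u); /* beyond this the hash is exhausted */
    return (hash >> (depth * HAMT_BITS)) & HAMT_MASK;
}

/* "Array mapped": one bitmap bit per logical slot says whether it is stored. */
static int hamt_has(uint32_t bitmap, unsigned slot)
{
    return (int)((bitmap >> slot) & 1u);
}

/* Index of a logical slot in the compact array: count occupied slots below it. */
static unsigned hamt_pos(uint32_t bitmap, unsigned slot)
{
    return popcount32(bitmap & ((1u << slot) - 1u));
}
```

    For example, with bitmap 0x29 (logical slots 0, 3 and 5 occupied), slot 3 is stored at compact index 1 and slot 5 at compact index 2, so a node with three children needs only three array entries.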

    A HAMT contains a lot of small nodes. If every entry is a bitmap plus a pointer, then it's two words, and if we use five-bit chunks, then each node can be up to 32 entries, but I would imagine the majority are small, so a typical node might be 64 bytes. I worry that doing a malloc for each one would end up with a lot of overhead. Are HAMTs often implemented with some more custom memory management? Can you allocate a big block and then carve it up?
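    One way to "allocate a big block and then carve it up" is a bump (arena) allocator. The sketch below is only an illustration of that idea under simple assumptions: a single fixed-size block, pointer-size alignment, and no per-node free. The type and function names (arena, arena_init, arena_alloc, arena_release) are hypothetical and are not taken from any HAMT library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One big block, handed out front to back. */
typedef struct {
    unsigned char *base; /* start of the block */
    size_t cap;          /* total bytes in the block */
    size_t used;         /* bytes handed out so far */
} arena;

/* Grab the block with a single malloc; returns 1 on success, 0 on failure. */
static int arena_init(arena *a, size_t cap)
{
    a->base = malloc(cap);
    a->cap = a->base ? cap : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Carve `size` bytes off the block, rounded up to pointer-size alignment
 * (assumed sufficient for nodes built from bitmaps and pointers).
 * Returns NULL when the block is full. */
static void *arena_alloc(arena *a, size_t size)
{
    const size_t align = sizeof(void *);
    size_t rounded = (size + align - 1) & ~(align - 1);

    assert(a->base != NULL);
    if (rounded < size || rounded > a->cap - a->used)
        return NULL; /* overflow or out of space */

    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Free every node at once by releasing the whole block. */
static void arena_release(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->cap = a->used = 0;
}
```

    With two-word entries on a 64-bit machine, a four-entry node is the 64 bytes estimated above, so one 64 KiB block would hold 1024 such nodes at the cost of a single malloc. The trade-off is that individual nodes cannot be returned to the arena; reuse would need free lists layered on top.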

    Could you do a slightly relaxed HAMT where nodes are not always fully compact, but sized to the smallest suitable power of two entries? That might let you use some sort of buddy allocation scheme. It would also let you insert and delete without having to reallocate the node. Although I suppose you can already do that by mapping a few empty slots.
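    The power-of-two sizing idea can be made concrete with a little arithmetic. The following is a hypothetical sketch (the names node_capacity, node_size_class and insert_needs_realloc are made up for illustration) assuming nodes of 1 to 32 entries, as with five-bit chunks.

```c
#include <assert.h>

/* Smallest power of two >= n, for node sizes 1..32. */
static unsigned node_capacity(unsigned n)
{
    unsigned cap = 1;
    assert(n >= 1 && n <= 32);
    while (cap < n)
        cap <<= 1;
    return cap;
}

/* log2 of the capacity (0..5): could index per-size free lists
 * in a buddy-style allocator. */
static unsigned node_size_class(unsigned n)
{
    unsigned c = 0;
    assert(n >= 1 && n <= 32);
    while ((1u << c) < n)
        c++;
    return c;
}

/* Does adding one entry to a node holding `count` entries outgrow its
 * current power-of-two capacity? */
static int insert_needs_realloc(unsigned count)
{
    assert(count >= 1 && count < 32);
    return node_capacity(count + 1) != node_capacity(count);
}
```

    Under this scheme a node holding 5 entries occupies an 8-entry allocation, so inserts 6, 7 and 8 happen in place, and only the ninth forces a move to a 16-entry allocation. There are just six size classes (1, 2, 4, 8, 16, 32), at the cost of leaving some slots unused in each node.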

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

C datastructure related posts

  • Visual Introduction to Hash-Array Mapped Tries (HAMTs)

    2 projects | news.ycombinator.com | 24 Aug 2023

Index

#  Project  Stars
1  hamt     260
