hamt
immer
| | hamt | immer |
|---|---|---|
| Mentions | 7 | 25 |
| Stars | 261 | 2,433 |
| Growth | - | - |
| Activity | 6.9 | 6.2 |
| Latest commit | 3 months ago | 3 days ago |
| Language | C | C++ |
| License | MIT License | Boost Software License 1.0 |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hamt
-
Visual Introduction to Hash-Array Mapped Tries (HAMTs)
This isn't a very good explanation. The Wikipedia article isn't great either. I like this description:
https://github.com/mkirchner/hamt#persistent-hash-array-mapp...
The name does tell you quite a bit about what these are:
* Hash - rather than directly using the keys to navigate the structure, the keys are hashed, and the hashes are used for navigation. This turns potentially long, poorly-distributed keys into short, well-distributed keys. However, that does mean you have to compute a hash on every access, and have to deal with hash collisions. The mkirchner implementation above calls collisions "hash exhaustion", and deals with them using some generational hashing scheme. I think I'd fall back to collision lists until that was conclusively proven to be too slow.
* Trie - the tree is navigated by indexing nodes using chunks of the (hash of the) key, rather than comparing the keys in the node
* Array mapped - sparse nodes are compressed, using a bitmap to indicate which logical slots are occupied, and then only storing those. The bitmaps live in the parent node, rather than the node itself, I think? Presumably that helps with fetching.
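The three ideas in the list above can be sketched in a few lines of C++. This is a minimal illustration, not the mkirchner implementation: the names `chunk_at`, `has_slot`, and `compact_index` are hypothetical, and a 32-bit hash with five-bit chunks is assumed.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 5-bit chunks, so each node has 32 logical slots.
constexpr unsigned kBitsPerLevel = 5;
constexpr std::uint32_t kChunkMask = (1u << kBitsPerLevel) - 1u;

// Portable population count (number of set bits).
constexpr unsigned popcount32(std::uint32_t x) {
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// "Trie": the chunk of the hash that selects a slot at a given depth.
constexpr unsigned chunk_at(std::uint32_t hash, unsigned depth) {
    return (hash >> (depth * kBitsPerLevel)) & kChunkMask;
}

// "Array mapped": the bitmap records which logical slots are occupied.
constexpr bool has_slot(std::uint32_t bitmap, unsigned slot) {
    return ((bitmap >> slot) & 1u) != 0;
}

// Position of a logical slot in the compacted array:
// the number of occupied slots below it.
constexpr std::size_t compact_index(std::uint32_t bitmap, unsigned slot) {
    return popcount32(bitmap & ((std::uint32_t{1} << slot) - 1u));
}
```

With slots 1, 4, and 9 occupied, the compacted array has three entries, and logical slot 9 lives at position 2.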
A HAMT contains a lot of small nodes. If every entry is a bitmap plus a pointer, then it's two words, and if we use five-bit chunks, then each node can be up to 32 entries, but I would imagine the majority are small, so a typical node might be 64 bytes. I worry that doing a malloc for each one would end up with a lot of overhead. Are HAMTs often implemented with some more custom memory management? Can you allocate a big block and then carve it up?
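To make the arithmetic concrete: assuming 8-byte words, a two-word entry is 16 bytes, so a 64-byte node holds 4 entries and a full 32-entry node takes 512 bytes. The "big block, carved up" idea could look like the bump allocator below. It is a sketch under those assumptions, not how any particular HAMT library manages memory; `Entry` and `NodeArena` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical entry layout from the paragraph: a bitmap word plus a pointer.
struct Entry {
    std::uintptr_t bitmap;
    void* child;
};

// Bump allocator: grabs large blocks and hands out node-sized slices.
// Individual nodes are never freed; everything is released with the arena.
class NodeArena {
public:
    explicit NodeArena(std::size_t entries_per_block = 4096)
        : entries_per_block_(entries_per_block) {}

    // Carve `count` contiguous, zero-initialised entries out of the current
    // block, starting a new block when the current one cannot fit them.
    // Assumes count <= entries_per_block (a node holds at most 32 entries).
    Entry* allocate(std::size_t count) {
        if (blocks_.empty() || used_ + count > entries_per_block_) {
            blocks_.push_back(std::make_unique<Entry[]>(entries_per_block_));
            used_ = 0;
        }
        Entry* out = blocks_.back().get() + used_;
        used_ += count;
        return out;
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    std::size_t entries_per_block_;
    std::size_t used_ = 0;
    std::vector<std::unique_ptr<Entry[]>> blocks_;
};
```

The trade-off is that a pure bump allocator cannot reuse the space of a node that has been replaced, so it suits build-once or short-lived tries better than long-running mutable ones.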
Could you do a slightly relaxed HAMT where nodes are not always fully compact, but sized to the smallest suitable power of two entries? That might let you use some sort of buddy allocation scheme. It would also let you insert and delete without having to reallocate the node. Although I suppose you can already do that by mapping a few empty slots.
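The power-of-two sizing floated above boils down to a little integer arithmetic. The helpers below are a hypothetical sketch (the names are not from any library) showing the rounded capacity, the matching size class for a buddy-style free list, and when an insert would force a reallocation.

```cpp
#include <cstddef>

// Smallest power of two that is >= n (returns 1 for n <= 1).
constexpr std::size_t capacity_for(std::size_t n) {
    std::size_t cap = 1;
    while (cap < n) cap <<= 1;
    return cap;
}

// Size class = log2 of the capacity: 0..5 for nodes of 1..32 entries.
constexpr unsigned size_class(std::size_t n) {
    unsigned cls = 0;
    while ((std::size_t{1} << cls) < n) ++cls;
    return cls;
}

// An insert only needs a new allocation when the entry count
// crosses a power-of-two boundary.
constexpr bool insert_needs_realloc(std::size_t current_entries) {
    return capacity_for(current_entries + 1) != capacity_for(current_entries);
}
```

Under this scheme a node growing from 5 to 8 entries stays in place, while growing from 4 to 5 moves it to the next size class.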
- Show HN: A hash array-mapped trie implementation in C
- Ask HN: What are some 'cool' but obscure data structures you know about?
immer
-
Text Editor Data Structures: Rethinking Undo
I've been working on an editor (not text) in C++ and pretty early got into undo/redo. I went down the route of doIt/undoIt for commands but that quickly got old. There was both the extra work needed to implement undo separately for every operation, but also the nagging feeling that the undo operation for some operation wasn't implemented correctly.
In the end, I switched to representing the entire document state using persistent data structures (using the immer library). This vastly simplified things and implementing undo/redo becomes absolutely trivial when using persistent data structures. It's probably not something that is suitable for all domains, but worth checking out.
https://github.com/arximboldi/immer
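The undo/redo approach described in this comment can be sketched with the standard library alone. This is not immer's API: `Document`, `History`, and `append_line` are hypothetical names, and the plain `std::vector` copy stands in for a persistent container, which would share structure between versions instead of copying.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical document type; a real editor would use a persistent container.
struct Document {
    std::vector<std::string> lines;
};

// Keeps every committed state as an immutable snapshot.
class History {
public:
    explicit History(Document initial) {
        states_.push_back(std::make_shared<Document>(std::move(initial)));
    }

    const Document& current() const { return *states_[cursor_]; }

    // An edit is a pure function from the old state to a new state.
    template <typename Edit>
    void commit(Edit edit) {
        states_.resize(cursor_ + 1);  // discard any redo branch
        states_.push_back(std::make_shared<Document>(edit(current())));
        ++cursor_;
    }

    // Undo and redo just move the cursor; there is no per-command inverse.
    bool undo() {
        if (cursor_ == 0) return false;
        --cursor_;
        return true;
    }

    bool redo() {
        if (cursor_ + 1 >= states_.size()) return false;
        ++cursor_;
        return true;
    }

private:
    std::vector<std::shared_ptr<const Document>> states_;
    std::size_t cursor_ = 0;
};

// Example edit: returns a new document and leaves the old one untouched.
inline Document append_line(const Document& doc, std::string line) {
    Document next = doc;  // full copy here; a persistent vector would share
    next.lines.push_back(std::move(line));
    return next;
}
```

Because no state is ever mutated, an undo cannot be implemented incorrectly for one specific command; the cost moves to memory, which is what structural sharing is meant to keep small.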
-
Show HN: A hash array-mapped trie implementation in C
How does this compare to https://github.com/arximboldi/immer (other than the C/C++ difference)?
Also, it's my understanding that, in practice, persistent data structures require a garbage collector in order to handle deallocation when used in a general-purpose way. How does your implementation handle that?
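One common way to handle deallocation without a tracing garbage collector is reference counting. The sketch below is a generic illustration with `std::shared_ptr`, not a description of how either library answers this question; `Node`, `List`, and `cons` are hypothetical names.

```cpp
#include <memory>
#include <utility>

// A persistent singly linked list: nodes are never modified after creation.
struct Node {
    int value;
    std::shared_ptr<const Node> next;
};
using List = std::shared_ptr<const Node>;

// Prepending allocates one node and shares the whole old list as its tail.
inline List cons(int value, List tail) {
    return std::make_shared<Node>(Node{value, std::move(tail)});
}
```

Two versions built from the same tail share its nodes, and a node is freed as soon as the last version referring to it goes away. Reference counting cannot reclaim cycles, but immutable structures built this way only ever point at older nodes, so cycles do not arise.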
-
Text Editor Data Structures
You might be interested in ewig and immer by Juan Pedro Bolivar Puente:
https://github.com/arximboldi/ewig
https://github.com/arximboldi/immer
See the author instantly opening a ~1GB text file with async loading, paging through, copying/pasting, and undoing/redoing in their prototype “ewig” text editor about 27 minutes into their talk here:
https://m.youtube.com/watch?v=sPhpelUfu8Q
It’s backed by a “vector of vectors” data structure called a relaxed radix balanced tree:
https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf
That original paper has seen lots of attention and attempts at performance improvements, such as:
https://hypirion.com/musings/thesis
https://github.com/hyPiRion/c-rrb
-
value semantics and spans/views
You’re absolutely right, however people have been putting in the “extra efforts” required for efficiency. Check out immer if you’re interested.
-
How to synchronize access to application data in multithreaded asio?
The C++ immer library: https://github.com/arximboldi/immer
-
Purely Functional Data Structure by Chris Okasaki [pdf]
For C++ check this one out - https://github.com/arximboldi/immer
- Persistent and immutable data structures written in C++14
-
Introducing B++ Trees, a C++ B+ Tree library
Yeah, I agree that I should link that Wikipedia page in the docs; I'll do that as soon as I get a chance. immer (https://github.com/arximboldi/immer) also links that page in its docs, for the exact same reason, I'm sure. Interestingly, there is a lot of overlap between persistent data structures in the functional programming sense and persistent data structures in the persisted-to-disk sense, because persistent data structures in the FP sense are one of the best ways to guarantee atomic updates and safe failure recovery in a persisted-to-disk system! Btrfs and ZFS, as well as many databases, are at their core basically just copy-on-write B+ trees.
-
What are some architectural patterns for creating a game editor.
I’ve never tried it, but I love the idea of implementing editor scene state using immutable data structures like https://github.com/arximboldi/immer. With that, every edit would append a new node to a list of scene states. Undo/redo becomes iterating your view of the scene up and down through that list. Can’t screw up an undo function if there’s never any work to do :P
-
TypeScript Without Side Effects
I have! I think it's related to the C++ immer library which I used several years ago in Vortex. It's kinda like the previous generation of ValueScript. 🍻
What are some alternatives?
AspNetCoreDiagnosticScenarios - This repository has examples of broken patterns in ASP.NET Core applications
babashka - Native, fast starting Clojure interpreter for scripting
clj-kondo - Static analyzer and linter for Clojure code that sparks joy
RVS_Generic_Swift_Toolbox - A Collection Of Various Swift Tools, Like Extensions and Utilities
graalvm-clojure - This project contains a set of "hello world" projects to verify which Clojure libraries do actually compile and produce native images under GraalVM.
multiversion-concurrency-control - Implementation of multiversion concurrency control, Raft, Left Right concurrency Hashmaps and a multi consumer multi producer Ringbuffer, concurrent and parallel load-balanced loops, parallel actors implementation in Main.java, Actor2.java and a parallel interpreter
ewig - The eternal text editor — Didactic Ersatz Emacs to show immutable data-structures and the single-atom architecture
CPython - The Python programming language
deprecated-coalton-prototype - Coalton is (supposed to be) a dialect of ML embedded in Common Lisp.
pyroscope - Continuous Profiling Platform. Debug performance issues down to a single line of code [Moved to: https://github.com/grafana/pyroscope]
awesome-modern-cpp - A collection of resources on modern C++