-
There are also prefetch instructions. I listened to https://www.youtube.com/watch?v=SetLtBH43_U recently (transcript: https://signalsandthreads.com/memory-management/), part of it talked about some work in OCaml's GC.
> ...each individual memory request that’s not in cache still has to wait the full 300 cycles. But if we can get 10 of them going at the same time, then we can be hitting a new cache line every 30 cycles. I mean, that’s not as good as getting one every 16 cycles, but it’s close. You’re actually able to get to a reasonable fraction of the raw memory bandwidth of the machine while just traversing randomly over this huge one gigabyte heap
https://github.com/ocaml/ocaml/pull/10195 shows the change adding prefetching to the marking phase (https://github.com/ocaml/ocaml/pull/9934 was done earlier for sweeping). There are some benchmarks in the thread.
-
InfluxDB
InfluxDB high-performance time series database. Collect, organize, and act on massive volumes of high-resolution data to power real-time intelligent systems.
-
I have some experience writing/modifying linked-list benchmarks (https://github.com/google/multichase) specifically to test memory latency.
It is extremely difficult, maybe impossible, to design a prefetcher that can predict the next cacheline(s) to prefetch in a linked-list. I am not aware of a single CPU that can do this consistently.
-
tigerbeetle
The financial transactions database designed for mission critical safety and performance.
Here's another example for the crowd: an "unbounded" fifo queue in a fixed memory environment.
https://github.com/tigerbeetledb/tigerbeetle/blob/main/src/f...
-
is Erlang and JDK general enough for you? Oh, both use linked lists,
https://github.com/erlang/otp/blob/master/lib/stdlib/src/que...
-
It's at odds with it, but so are many other (efficient) data structures found in the standard library. The solution is usually to use unsafe { } inside the data structure to allow those criss-crossing pointers (while exposing safe, checker-friendly interfaces to the outside world)
This is a great resource that outlines several different strategies for implementing linked lists in Rust: https://rust-unofficial.github.io/too-many-lists/
-
-
CodeRabbit
CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.
-
C++'s STL linked list for comparison (libcxx).
https://github.com/llvm-mirror/libcxx/blob/master/include/li...
-
I did this for an object pool in re2j and saw even single threaded performance improve.
https://github.com/google/re2j/blob/dc7d6e5d41225dc0825ea6fe...
Java doesn't suffer from pointer address ABA but I did have to handle reinsertion (except when the stack had only one element).
-
freebsd-src
The FreeBSD src tree publish-only repository. Experimenting with 'simple' pull requests....
-
Adding allocator support does add a bit more noise:
https://github.com/rust-lang/rust/pull/103093/files#diff-8bd...
-
https://github.com/AcademySoftwareFoundation/openvdb/issues/...
Longer answer is that the size of the "blocks" is limited to 512 bytes or one element, whichever is larger. So unless your elements are really tiny, it is strictly a pessimization.
-
Some game engine developers have certainly used this idea, to prevent any reallocations when the dynamic array needs to be resized. Such as:
https://ruby0x1.github.io/machinery_blog_archive/post/virtua...
Some ECS implementations use this to reduce the overhead of dynamic arrays, as well as ensuring pointer stability of the items stored. For example entt:
https://skypjack.github.io/2021-06-12-ecs-baf-part-11/
And here's a library implementing a resizable, pointer-stable vector using virtual memory functionality from the OS:
https://github.com/mknejp/vmcontainer
-
btree
Discontinued a simple python btree [GET https://api.github.com/repos/samsquire/btree: 404 - Not Found // See: https://docs.github.com/rest/repos/repos#get-a-repository] (by samsquire)
For a simple and readable implementation of a Python btree see this:
https://github.com/samsquire/btree
I tried to keep the implementation as simple as I could.
-
glibc realloc(3) does this on Linux, by way of mremap(2) with MREMAP_MAYMOVE [1].
[1] https://github.com/bminor/glibc/blob/15a94e6668a6d7c5697e805...
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives