slitter
Slitter is a C- and Rust-callable slab allocator implemented primarily in Rust, with some C for performance or to avoid unstable Rust features.
You use a different interface than malloc/free and ask the programmer to pass in a class tag. https://github.com/backtrace-labs/slitter/blob/7afb9781fd25b...
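To illustrate the idea, here is a minimal sketch of what a class-tagged allocation interface might look like. All names here (`Class`, `Slab`, `register_class`, `alloc`, `release`) are invented for illustration and are not Slitter's actual API; the point is only that callers pass a pre-registered class tag rather than a byte count.

```rust
// Hypothetical class-tagged slab interface (invented names, not Slitter's API).
#[derive(Clone, Copy, PartialEq)]
pub struct Class(usize);

pub struct Slab {
    // One fixed object size and one free list per registered class.
    sizes: Vec<usize>,
    free_lists: Vec<Vec<Box<[u8]>>>,
}

impl Slab {
    pub fn new() -> Self {
        Slab { sizes: Vec::new(), free_lists: Vec::new() }
    }

    // Each object class is registered once, up front.
    pub fn register_class(&mut self, object_size: usize) -> Class {
        self.sizes.push(object_size);
        self.free_lists.push(Vec::new());
        Class(self.sizes.len() - 1)
    }

    // Allocation takes the class tag, not a size: the allocator already
    // knows how big the object is and can segregate classes into slabs.
    pub fn alloc(&mut self, class: Class) -> Box<[u8]> {
        self.free_lists[class.0]
            .pop()
            .unwrap_or_else(|| vec![0u8; self.sizes[class.0]].into_boxed_slice())
    }

    // Release must name the same class the object was allocated with.
    pub fn release(&mut self, class: Class, object: Box<[u8]>) {
        self.free_lists[class.0].push(object);
    }
}

fn main() {
    let mut slab = Slab::new();
    let node_class = slab.register_class(64);
    let buf = slab.alloc(node_class);
    assert_eq!(buf.len(), 64);
    slab.release(node_class, buf);
    // The freed object is recycled on the next allocation of that class.
    assert_eq!(slab.free_lists[node_class.0].len(), 1);
}
```

One payoff of this design is that mismatched tags (allocating as one class, freeing as another) become a checkable error rather than silent heap corruption.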
I'm working on a memory profiler for Python that is fast enough to run in production (see link below), so I've ended up with some similar problems re performance and importance of testing.
A few things the article talks about where one can maybe do even better:
1. likely()/unlikely() not being in stable Rust. This is true, but the hashbrown project has some hacked-up variants it claims work on stable: https://github.com/rust-lang/hashbrown/blob/bbda6e0077bafb75...
2. Rust not having fast thread locals. Same problem for me, so I likewise did it in C with "initial-exec". But! If you use Clang, you can get LTO across C and Rust, so you get fast thread locals _and_ no function call overhead. You basically need to use the same version of Clang as Rust does (12 at the moment) and do a little song and dance with linker and compiler flags. See https://matklad.github.io/2020/10/03/fast-thread-locals-in-r...
3. For testing these sorts of things, being able to assert "this test touched this code path" is extremely useful. In my case, for example, I have different code paths for sampled and unsampled allocations, but from the perspective of the code calling malloc() everything should be identical. So how do you tell whether the correct code path was used? Coverage marks are a great solution for this: https://ferrous-systems.com/blog/coverage-marks/
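On point 1, the stable-Rust trick is roughly this: an empty `#[cold]` function nudges the optimizer to treat whichever branch calls it as the unlikely one. This is a sketch of the approach the hashbrown link describes, not a guarantee from the language:

```rust
// Stable-Rust approximation of likely()/unlikely(): calling a #[cold]
// function marks that branch as improbable to the optimizer.
#[cold]
#[inline]
fn cold() {}

#[inline]
fn likely(b: bool) -> bool {
    if !b {
        cold()
    }
    b
}

#[inline]
fn unlikely(b: bool) -> bool {
    if b {
        cold()
    }
    b
}

fn main() {
    // A hot path (e.g. an allocator fast path) would be guarded like this:
    let fast_path_available = true;
    if likely(fast_path_available) {
        println!("fast path");
    }
    assert!(likely(true));
    assert!(!unlikely(false));
}
```

Note this is purely a hint: the semantics are unchanged, and the optimizer is free to ignore it.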
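On point 3, a coverage mark can be as small as a counter the production code bumps and the test checks. The `cov-mark` crate from the linked post does this more robustly (per-test scoping, named marks); the version below is a deliberately minimal sketch with invented names (`SAMPLED_PATH_HITS`, `record_allocation`):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Minimal coverage-mark sketch: the code under test "hits" a mark, and
// the test asserts the mark was hit, proving the intended path ran even
// though the observable behavior is identical either way.
static SAMPLED_PATH_HITS: AtomicUsize = AtomicUsize::new(0);

fn record_allocation(sampled: bool) {
    if sampled {
        // Mark: this allocation went down the sampled path.
        SAMPLED_PATH_HITS.fetch_add(1, Ordering::Relaxed);
        // ... sampled-path bookkeeping would go here ...
    }
    // ... bookkeeping common to both paths ...
}

fn main() {
    let before = SAMPLED_PATH_HITS.load(Ordering::Relaxed);
    record_allocation(true);
    let after = SAMPLED_PATH_HITS.load(Ordering::Relaxed);
    // Without the mark, this test could pass even if the sampled path
    // were silently skipped.
    assert_eq!(after, before + 1, "sampled code path was not exercised");
}
```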
(The Python profiler, if anyone is interested: I've already released an open source memory profiler that tracks all allocations, https://pythonspeed.com/fil/. Unlike most memory profilers, it tells you the source of allocations at peak memory, which is key for data processing applications. The commercial production-grade version I'm now working on uses sampling and will be even more focused on data-processing batch applications; the goal is to have essentially no performance overhead so it can always be on.)
Well, unfortunately, address space isn't unlimited. In my experience developing 64-bit wasm engines, the typical approach of reserving 8GB of address space for each wasm memory leads to address space exhaustion after only about 1000 memories (regardless, of course, of how many actual pages those memories use). So for a long-lived VM that might load and unload hundreds or even thousands of programs, it's really important to clean up those reservations promptly.
In V8, cleaning up the reservations for Wasm memories is ultimately tied to garbage collection of the JS heap, since JS objects can root memory. That led to the somewhat clunky need to retry allocation in a loop, running a full GC if allocation fails the first few times (https://github.com/v8/v8/blob/master/src/objects/backing-sto...).
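The retry-with-GC pattern can be sketched abstractly like this. All names here (`AddressSpace`, `run_full_gc`, `reserve_with_gc_retries`) are invented for illustration; V8's real logic lives in the C++ backing-store code linked above:

```rust
// Sketch: reservations rooted only by dead JS objects aren't returned
// until a GC runs, so a failed reservation triggers a full GC and retry.
struct AddressSpace {
    budget: usize,      // bytes of address space still reservable
    reclaimable: usize, // bytes pinned only by dead JS objects
}

impl AddressSpace {
    fn try_reserve(&mut self, bytes: usize) -> bool {
        if self.budget >= bytes {
            self.budget -= bytes;
            true
        } else {
            false
        }
    }

    // A full GC frees reservations that were kept alive only by
    // unreachable JS objects rooting the wasm memory.
    fn run_full_gc(&mut self) {
        self.budget += self.reclaimable;
        self.reclaimable = 0;
    }

    fn reserve_with_gc_retries(&mut self, bytes: usize, retries: u32) -> bool {
        for attempt in 0..=retries {
            if self.try_reserve(bytes) {
                return true;
            }
            if attempt < retries {
                self.run_full_gc();
            }
        }
        false
    }
}

fn main() {
    // 8 GiB per wasm memory; only 4 GiB free, but 8 GiB held by garbage.
    let gib = 1usize << 30;
    let mut space = AddressSpace { budget: 4 * gib, reclaimable: 8 * gib };
    assert!(space.reserve_with_gc_retries(8 * gib, 3));  // succeeds after GC
    assert!(!space.reserve_with_gc_retries(8 * gib, 3)); // truly exhausted
}
```

The clunkiness the comment describes follows directly from the coupling: reservation lifetime is decided by a collector that knows nothing about address-space pressure, so the allocator has to poke it and hope.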