How to use mmap safely in Rust?

This page summarizes the projects mentioned and recommended in the original post on /r/rust

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • monokakido

    A Rust library for parsing and interpreting the Monokakido dictionary format. Full test coverage and efficient implementation with minimal dependencies.

  • I'm developing a library and a CLI tool to parse a certain dictionary format: https://github.com/golddranks/monokakido/ (The format of a dictionary app called Monokakido: https://www.monokakido.jp/en/dictionaries/app/ )

  • rfcs

    RFCs for changes to Rust

  • Then there are some recent developments like the Atomic memcpy RFC: https://github.com/rust-lang/rfcs/pull/3301 Memory maps aren't specifically mentioned, but they seem relevant. If mmap returning a &[AtomicPerByte] would solve the problem, I'd readily welcome it. Having an actual type to represent the (lack of) guarantees of the memory layout might actually bring some ergonomic benefits too. At the moment, if I go with read_volatile, I'd have to reimplement some basic stuff like string comparison and copying using volatile lookups.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • mmap_experiment

  • The solution for this, is to either create a temp file and use reflink (note that this is a fork and we haven't released it yet) to create CoW reference to data in the temp file, or use some other mechanisms to prevent others from modifying the file (e.g. setting it to be read-only).

  • fst

    Represent large sets and maps compactly with finite state transducers.

  • The fst crate effectively relies on mmap for it to work right. The folks here suggesting you just use the heap might be right, but only if using the heap is actually plausible. If your dictionary is GBs big (an FST might be bigger than available memory), then copying it the heap first would be disastrous.

  • imdb-rename

    A command line tool to rename media files based on titles from IMDb.

  • imdb-rename is an example of a tool that memory maps FSTs on disk in order to execute fulltext searches very quickly on the command line.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts