-
regex
An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
For our company, Rust is a solid fit for our problem. We ingest terabytes of logs per day from customers, and we need to perform fast full-text search on a lot of data in parallel using many threads. Various string search crates in Rust, like aho-corasick and regex, are highly optimized and leverage SIMD instructions. (Thank you Andrew Gallant, Alex Crichton, and others for these amazing libraries!)
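As a rough illustration of the "search a lot of data in parallel using many threads" pattern described above, here is a minimal sketch that uses only the standard library. The function name `count_matching_lines` and the one-thread-per-chunk layout are assumptions made for this example; it uses plain `str::contains` rather than the aho-corasick or regex crates, and a production system would more likely use a bounded thread pool.

```rust
use std::thread;

/// Counts the lines across all `chunks` that contain `needle`.
/// Each chunk is searched on its own scoped thread (hypothetical layout).
pub fn count_matching_lines(chunks: &[&str], needle: &str) -> usize {
    thread::scope(|s| {
        // Spawn one worker per chunk; scoped threads may borrow `chunks`
        // and `needle` because they are joined before `scope` returns.
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .lines()
                        .filter(|line| line.contains(needle))
                        .count()
                })
            })
            .collect();

        // Collect the per-chunk counts into a single total.
        handles
            .into_iter()
            .map(|h| h.join().expect("search thread panicked"))
            .sum::<usize>()
    })
}
```

Calling `count_matching_lines(&["error: disk full\ninfo: ok", "error: timeout"], "error")` would search the two chunks concurrently and add up the matching lines. Swapping the `contains` call for a precompiled matcher from one of the crates mentioned above is the natural next step, but that is outside what this sketch shows.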
The SIMD "Teddy" algorithm in aho-corasick for multiple substring search is quite a bit more complicated and uses many different vendor intrinsics, such as alignr and shuffle in addition to movemask. See this and this. In the latter link, search for _mm to see all of the additional intrinsics being used.
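To give a feel for what a movemask-style intrinsic does, here is a small sketch of a single-byte search using SSE2 intrinsics from `std::arch`. This is not the Teddy algorithm and is not taken from aho-corasick; it only shows the simplest compare-then-movemask step. The function name `find_byte` is made up for this example, and other architectures fall back to a scalar scan.

```rust
/// Returns the index of the first `needle` byte in `haystack`, if any.
#[cfg(target_arch = "x86_64")]
pub fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    use std::arch::x86_64::{
        __m128i, _mm_cmpeq_epi8, _mm_loadu_si128, _mm_movemask_epi8, _mm_set1_epi8,
    };

    let mut offset = 0;
    // SAFETY: SSE2 is part of the x86_64 baseline, and the loop bound keeps
    // every 16-byte unaligned load inside `haystack`.
    unsafe {
        // Broadcast the needle into all 16 lanes.
        let splat = _mm_set1_epi8(needle as i8);
        while offset + 16 <= haystack.len() {
            let chunk = _mm_loadu_si128(haystack.as_ptr().add(offset) as *const __m128i);
            // Lanes equal to the needle become 0xFF, all others 0x00.
            let eq = _mm_cmpeq_epi8(chunk, splat);
            // movemask packs the high bit of each lane into a 16-bit mask.
            let mask = _mm_movemask_epi8(eq) as u32;
            if mask != 0 {
                // The lowest set bit is the first matching lane.
                return Some(offset + mask.trailing_zeros() as usize);
            }
            offset += 16;
        }
    }
    // Scalar scan for the final partial chunk (fewer than 16 bytes).
    haystack[offset..]
        .iter()
        .position(|&b| b == needle)
        .map(|i| offset + i)
}

/// Scalar fallback for architectures other than x86_64.
#[cfg(not(target_arch = "x86_64"))]
pub fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}
```

For example, `find_byte(b"hello, world", b'w')` returns `Some(7)`. A multi-pattern searcher like the one described above layers shuffles and alignr on top of this basic step, which is where the extra complexity comes from.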
Related posts
-
Aho-Corasick Algorithm
-
CryptoFlow: Building a secure and scalable system with Axum and SvelteKit - Part 3
-
how to get the index of substring in source string, support unicode in rust.
-
Aho Corasick Algorithm For Efficient String Matching (Python & Golang Code Examples)
-
Aho-corasick (and the regex crate) now uses SIMD on aarch64