Memchr 2.4 now has an implementation of substring search on arbitrary bytes

This page summarizes the projects mentioned and recommended in the original post on /r/rust.

  • sliceslice-rs

    A fast implementation of single-pattern substring search using SIMD acceleration.

  • Aside from that, their SIMD implementation is better optimized than the one I wrote. Aside from the codegen problem I talked about on that PR, sliceslice does better with its confirmation step by specializing calls to memcmp for all needles up to length 16. This repeats the entire implementation 16 times or so (for each of SSE2 and AVX2, so 32 in total I believe), but lets the memcmp call be a bit better than a generic one. We could do the same in memchr, but I wanted to see how much mileage we could get with fewer copies of the code and a lower latency implementation of memcmp.

  • rust-memchr

    Optimized string search routines for Rust.


  • regex

    An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.

  • There is also regex-syntax. And at some point, I'm going to be adding regex-automata: https://github.com/rust-lang/regex/issues/656

  • regex-automata

    (Discontinued) A low-level regular expression library that uses deterministic finite automata.

  • The work on regex-automata 0.2 has been underway for over a year now: https://github.com/BurntSushi/regex-automata/tree/ag/work There's a lot done, but still a lot more to go. Once that's done, regex proper should be pretty close to a thin layer that glues regex-syntax, regex-automata, memchr and aho-corasick together. I don't currently expect regex to grow any more dependencies than that. And as it is, aho-corasick and memchr are both optional dependencies. Right now, regex-syntax is the only required dependency, but regex-automata will be added to that list.

  • ripgrep

    ripgrep recursively searches directories for a regex pattern while respecting your gitignore

  • Oh those have been out-dated for a loooong time. I do occasionally re-run the benchmark suite when I get a chance. The last time I did was October 2020: https://github.com/BurntSushi/ripgrep/blob/master/benchsuite/runs/2020-10-14-archlinux-frink/summary (Note that I've removed some tools from the benchmarks, because they are no longer interesting to benchmark.)
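
The comment quoted under sliceslice-rs describes specializing the confirmation step (the memcmp call) per needle length. The following is a rough sketch of that idea only, not code from sliceslice or memchr: it uses no SIMD, the candidate scan is a plain first-byte check, and the names `find`, `search_fixed` and `search_generic` are hypothetical. Specialization is limited to lengths 1 through 4 to keep it short, where the comment mentions lengths up to 16.

```rust
/// Search specialized on a compile-time needle length N.
/// Because N is a constant, the comparison in the confirmation step has a
/// fixed size, which may let the compiler inline it instead of calling a
/// general-purpose memcmp. Each N used below produces its own copy of this
/// function, which is the code-size cost mentioned in the comment.
fn search_fixed<const N: usize>(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    debug_assert_eq!(needle.len(), N);
    if haystack.len() < N {
        return None;
    }
    let first = needle[0];
    for start in 0..=haystack.len() - N {
        // Cheap candidate check on the first byte, then the
        // fixed-length confirmation step.
        if haystack[start] == first && haystack[start..start + N] == needle[..N] {
            return Some(start);
        }
    }
    None
}

/// Fallback with a runtime-length comparison for longer needles.
fn search_generic(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Dispatch once on the needle length, then run the matching copy.
pub fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    match needle.len() {
        0 => Some(0),
        1 => search_fixed::<1>(haystack, needle),
        2 => search_fixed::<2>(haystack, needle),
        3 => search_fixed::<3>(haystack, needle),
        4 => search_fixed::<4>(haystack, needle),
        _ => search_generic(haystack, needle),
    }
}
```

Calling `find(b"hello world", b"wor")` takes the `search_fixed::<3>` path, while a five-byte needle such as `b"world"` falls through to `search_generic`. Whether the specialized copies are actually faster depends on the compiler and target, so this would need benchmarking before drawing conclusions.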

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
