regex-automata
sliceslice-rs
| | regex-automata | sliceslice-rs |
|---|---|---|
| Mentions | 5 | 2 |
| Stars | 349 | 87 |
| Growth | - | - |
| Activity | 0.0 | 5.9 |
| Last commit | 10 months ago | 3 months ago |
| Language | Rust | Rust |
| License | The Unlicense | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
regex-automata
-
regex 1.8.0 released (no-op escapes allowed, (?<name>re) syntax added)
I believe you're the second person to tell me they were confused by this, so there are probably several others confused but didn't say anything. I've added a warning to the top of regex-automata's README.
-
After years of work and discussion, `once_cell` has been merged into `std` and stabilized
For anyone following along at home, we're having a very helpful discussion about the implementation I posted in my sibling comment here: https://github.com/BurntSushi/regex-automata/issues/30
-
Pomsky 0.8 released: A powerful and modern regular expression language
My current technique only gets applied to alternations of simple literals. But the idea is generalizable and I speculate that it is actually impactful to generalize it.
-
Rust: A Critical Retrospective
(I could use '_ => {}' instead of 'None' to save a few more.)
I do find the 'if let' variant to be a bit easier to read. It's optimizing for a particular and somewhat common case, so it does of course overlap with 'match'. But I don't find this particular overlap to be too bad. It's usually pretty clear when to use one vs the other.
But like I said, I could live without 'if let'. It is not a major quality of life enhancement to me. Neither will its impending extensions, i.e., 'if let pattern = foo && some_boolean_condition {'.
[1]: https://github.com/BurntSushi/regex-automata/blob/fbae906823...
[2]: https://github.com/BurntSushi/regex-automata/blob/fbae906823...
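The overlap between 'if let' and 'match' described in the comment above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical and are not taken from regex-automata or from the linked code.

```rust
// Hypothetical helpers for illustration only.

/// Handles the `Some` case with a full `match`; the empty arm is required
/// so that the match stays exhaustive.
fn push_with_match(input: Option<u32>, out: &mut Vec<u32>) {
    match input {
        Some(v) => out.push(v),
        None => {} // `_ => {}` would also work here
    }
}

/// Same behavior with `if let`: only the interesting case is spelled out.
fn push_with_if_let(input: Option<u32>, out: &mut Vec<u32>) {
    if let Some(v) = input {
        out.push(v);
    }
}

/// Combining a pattern with a boolean condition by nesting, which works
/// without relying on the `if let ... && condition` extension.
fn push_if_large(input: Option<u32>, limit: u32, out: &mut Vec<u32>) {
    if let Some(v) = input {
        if v > limit {
            out.push(v);
        }
    }
}
```

Both of the first two functions behave identically; the choice is mostly about readability when only one case matters.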
-
Memchr 2.4 now has an implementation of substring search on arbitrary bytes
[The work on regex-automata 0.2 has been underway for over a year now.](https://github.com/BurntSushi/regex-automata/tree/ag/work) There's a lot done, but still a lot more to go. Once that's done, regex proper should be pretty close to a thin layer that glues regex-syntax, regex-automata, memchr and aho-corasick together. I don't currently expect regex to grow any more dependencies than that. And as it is, aho-corasick and memchr are both optional dependencies. Right now, regex-syntax is the only required dependency, but regex-automata will be added to that list.
sliceslice-rs
-
Memchr 2.4 now has an implementation of substring search on arbitrary bytes
Aside from that, their SIMD implementation is better optimized than the one I wrote. Aside from the codegen problem I talked about on that PR, sliceslice does better with its confirmation step by specializing calls to memcmp for all needles up to length 16. This repeats the entire implementation 16 times or so (for each of SSE2 and AVX2, so 32 in total I believe), but lets the memcmp call be a bit better than a generic one. We could do the same in memchr, but I wanted to see how much mileage we could get with fewer copies of the code and a lower latency implementation of memcmp.
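As a rough sketch of the idea in the comment above (specializing the confirmation comparison by needle length), the following uses const generics so that each length gets its own fixed-size comparison. It is not sliceslice's or memchr's actual code: the names `eq_fixed` and `confirm` are hypothetical, only a few lengths are specialized, and the dispatch happens per call here, whereas the comment describes repeating the entire search implementation per length.

```rust
use std::convert::TryFrom;

/// Compares two slices that must both be exactly `N` bytes long.
/// Because `N` is a compile-time constant, the compiler is free to emit a
/// fixed-width comparison rather than a call to a general-purpose memcmp.
#[inline(always)]
fn eq_fixed<const N: usize>(candidate: &[u8], needle: &[u8]) -> bool {
    let c = <&[u8; N]>::try_from(candidate);
    let n = <&[u8; N]>::try_from(needle);
    matches!((c, n), (Ok(c), Ok(n)) if c == n)
}

/// Confirmation step: checks whether `needle` occurs in `haystack` at `pos`.
/// Short needles dispatch to a length-specialized comparison; everything
/// else falls back to the generic slice comparison.
fn confirm(haystack: &[u8], pos: usize, needle: &[u8]) -> bool {
    let end = pos.saturating_add(needle.len());
    let candidate = match haystack.get(pos..end) {
        Some(c) => c,
        None => return false, // candidate would run past the haystack
    };
    match needle.len() {
        1 => eq_fixed::<1>(candidate, needle),
        2 => eq_fixed::<2>(candidate, needle),
        3 => eq_fixed::<3>(candidate, needle),
        4 => eq_fixed::<4>(candidate, needle),
        8 => eq_fixed::<8>(candidate, needle),
        16 => eq_fixed::<16>(candidate, needle),
        // Assumption: lengths not listed above use the generic path.
        _ => candidate == needle,
    }
}
```

The trade-off mentioned in the comment shows up even in this small sketch: every specialized length is another copy of the comparison code in the binary.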
What are some alternatives?
pomsky - A new, portable, regular expression language
nsimd - Agenium Scale vectorization library for CPUs and GPUs
regex - An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
grex - A command-line tool and Rust library with Python bindings for generating regular expressions from user-provided test cases
rust-memchr - Optimized string search routines for Rust.
volk - The Vector Optimized Library of Kernels
biscuit - Biscuit research OS
ripgrep - ripgrep recursively searches directories for a regex pattern while respecting your gitignore
re2 - R interface to Google re2 (C++) regular expression engine
highway - Performance-portable, length-agnostic SIMD with runtime dispatch