htop vs rust-memchr

| | htop | rust-memchr |
|---|---|---|
| Mentions | 6 | 33 |
| Stars | 5,482 | 1,148 |
| Growth (stars, month over month) | - | 4.0% |
| Activity | 1.2 | 6.0 |
| Last commit | over 4 years ago | 29 days ago |
| Language | C | Rust |
| License | GNU General Public License v3.0 only | The Unlicense |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
htop
- htop VS htop - a user suggested alternative | 2 projects | 1 Jun 2023
- A 17-line C program freezes the Mac kernel (2018)
- Wait time for a process
- Modern alternatives to Unix commands
It was dormant for quite some time but the FOSS community decided to take it over with the original author's blessing.
- I love open source devs from the bottom of my heart ❤️
Yes, it's true; here's a GitHub issue thread if you want to read more.
- The clock is ticking for my Devuan VM (kernel update etc.). Where has htop's iconic “(!)” after 100+ days of uptime gone?
rust-memchr
- SIMD-friendly algorithms for substring searching
The "AVX2 (generic)" approach is roughly what ripgrep uses (via Rust's `regex` crate) to accelerate most searches. Even something like `\w+\s+Sherlock\s+\w+` will benefit since ripgrep will pluck `Sherlock` out of the regex and search that.
The actual implementation is here: https://github.com/BurntSushi/memchr?tab=readme-ov-file#algo...
The main difference from the algorithm presented here is that, instead of always using the first and last bytes of the needle, a heuristic is used to try to pick two bytes that occur less frequently according to a background frequency distribution.
It ends up being quite a bit faster than just plain Two-Way or even GNU libc's implementation of `memmem`. From the root of the `memchr` repository:
$ rebar rank benchmarks/record/x86_64/2023-12-29.csv -e '^rust/memchr/memmem/(oneshot|prebuilt|twoway)' -e '^libc/memmem/oneshot'
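To make the two-rare-bytes heuristic above concrete, here is a minimal scalar sketch of the idea. It is not the memchr crate's implementation: the real code does the candidate scan with SIMD and uses a carefully tuned background frequency table, while the `rank` function and its thresholds below are stand-ins invented for illustration.

```rust
// Scalar sketch of a "two rare bytes" prefilter. The real implementation
// vectorizes the candidate scan; rank() here is a made-up stand-in for a
// background byte-frequency table (lower = rarer in typical haystacks).
fn rank(b: u8) -> u8 {
    match b {
        b' ' | b'e' | b't' | b'a' | b'o' | b'i' | b'n' | b's' => 255, // very common
        b'a'..=b'z' => 128,
        _ => 10, // uppercase, digits, punctuation: usually rarer
    }
}

/// Pick the two needle offsets whose bytes are (heuristically) rarest.
fn pick_rare_pair(needle: &[u8]) -> (usize, usize) {
    let mut offsets: Vec<usize> = (0..needle.len()).collect();
    offsets.sort_by_key(|&i| rank(needle[i]));
    (offsets[0], *offsets.get(1).unwrap_or(&offsets[0]))
}

/// Substring search that only pays for a full comparison when both rare
/// bytes line up at their offsets.
fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    if needle.len() > haystack.len() {
        return None;
    }
    let (i1, i2) = pick_rare_pair(needle);
    let (b1, b2) = (needle[i1], needle[i2]);
    for start in 0..=haystack.len() - needle.len() {
        // Candidate check: two byte comparisons before full verification.
        if haystack[start + i1] == b1
            && haystack[start + i2] == b2
            && &haystack[start..start + needle.len()] == needle
        {
            return Some(start);
        }
    }
    None
}

fn main() {
    let hay = b"the quick brown fox consulted Sherlock Holmes";
    assert_eq!(find(hay, b"Sherlock"), Some(30));
}
```

The payoff is that the full comparison only runs when two unlikely bytes line up, which is what keeps false positives out of the hot path for needles like `Sherlock`.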
- Show HN: Krep a High-Performance String Search Utility Written in C
That's probably because pcmpestri is trash for substring search. There is a good reason why ripgrep doesn't use it. :-)
I looked for an authoritative source on why pcmpestri is trash, and I couldn't find anything I was happy linking to other than Agner Fog's instruction tables: https://www.agner.org/optimize/instruction_tables.pdf You can see that the throughput and latency for pcmpestri are just awful.
And yes, not having any code to print the matching lines means that the only code path in krep is just counting things. If that's all your tool is doing, you can totally beat ripgrep or any other tool that is more applicable to generalized use cases. It's why the `memchr` crate (what ripgrep uses for single substring search) has a specialized routine for counting occurrences of bytes (which ripgrep uses for line counting): https://github.com/BurntSushi/memchr/blob/746182171d2e886006...
Because it's faster to do that than it is to reuse the generalized `memchr` API for finding the location of matching bytes.
And counting matches in a multi-threaded context is way easier than actually managing the printing of matches in the same order that you get them.
krep isn't big. You can skim its source code in a few minutes and get a good idea of how it works.
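To make the counting-versus-locating distinction above concrete, here is a small sketch (assuming the `memchr` crate as a Cargo dependency). The iterator version answers the general "where is the next match?" question over and over; the direct count only has to tally matches, which is the kind of simpler inner loop a specialized SIMD counting routine exploits. The scalar loop below is an illustrative stand-in, not the crate's actual routine.

```rust
use memchr::memchr_iter; // real API from the memchr crate

/// Count newlines by repeatedly finding the *position* of the next one.
fn count_via_positions(haystack: &[u8]) -> usize {
    memchr_iter(b'\n', haystack).count()
}

/// Count newlines without materializing positions at all. A SIMD version
/// would compare 16/32 bytes at a time and accumulate the compare mask.
fn count_direct(haystack: &[u8]) -> usize {
    haystack.iter().filter(|&&b| b == b'\n').count()
}

fn main() {
    let data = b"line one\nline two\nline three\n";
    assert_eq!(count_via_positions(data), 3);
    assert_eq!(count_direct(data), 3);
}
```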
- Regular Expression Matching Can Be Simple and Fast (2007)
As the author of ripgrep, I wouldn't necessarily buy this. I suppose I might agree with it in the extremes, but SIMD prefilters are quite exceptionally difficult to beat with scalar code in the common cases. Longer patterns are great for the SIMD algorithms in ripgrep because of its use of background frequency distribution heuristics. That is, the longer the pattern, the less likely your candidate detection is to produce a false positive in its hot path.
I don't mean to 100% disagree with you, but I think it's misleading to suggest a sort of one-dimensional view of things where, as the pattern gets larger, SIMD gets worse compared to sublinear search algorithms. There are other factors at play here, and, importantly, it depends on what "long" means in this context.
In many practical circumstances, "short" might be a few bytes, while "long" is 16 bytes. But maybe your idea of "long" is actually much longer.
If you're curious how your own algorithm stacks up to ripgrep's, you can plug your implementation into the `memchr` crate's benchmark harness: https://github.com/BurntSushi/memchr
It uses rebar: https://github.com/BurntSushi/rebar
- Memchr: Optimized string search routines for Rust
- Ask HN: What's the fastest programming language with a large standard library?
That's what the `memchr` crate does. It uses `vshrn` just like in your links, and `vpmaxq` before even bothering with `vshrn`: https://github.com/BurntSushi/memchr/blob/c6b885b870b6f1b9bf...
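For readers unfamiliar with the trick being referenced, here is a minimal sketch (assuming an aarch64 target) of using `vshrn` as a movemask substitute: NEON has no direct equivalent of x86's movemask, so after a byte-wise compare, a narrowing right shift by 4 packs one nibble per input byte into a 64-bit value. This is an illustration of the idea only, not the memchr crate's code, and it folds the cheap "any match at all?" test into a check of the mask rather than using a separate `vpmaxq` reduction.

```rust
// Sketch of the NEON nibble-mask trick (aarch64 only). Each byte of the
// compare result is 0xFF or 0x00; vshrn_n_u16::<4> keeps 4 bits per input
// byte, so the whole 16-byte compare result fits in one u64.
#[cfg(target_arch = "aarch64")]
unsafe fn first_match_in_16(block: &[u8; 16], needle: u8) -> Option<usize> {
    use std::arch::aarch64::*;

    let hay = vld1q_u8(block.as_ptr()); // load 16 haystack bytes
    let nee = vdupq_n_u8(needle);       // broadcast the needle byte
    let eq = vceqq_u8(hay, nee);        // 0xFF where equal, 0x00 otherwise

    // Narrowing shift: 128 bits of compare results become a 64-bit mask
    // with one nibble per haystack byte.
    let nibbles = vshrn_n_u16::<4>(vreinterpretq_u16_u8(eq));
    let mask: u64 = vget_lane_u64::<0>(vreinterpret_u64_u8(nibbles));

    if mask == 0 {
        None // cheap rejection: no match anywhere in this block
    } else {
        Some((mask.trailing_zeros() / 4) as usize) // 4 mask bits per byte
    }
}

#[cfg(target_arch = "aarch64")]
fn main() {
    let block: [u8; 16] = *b"find the Z here.";
    // SAFETY: NEON is mandatory on aarch64, so these intrinsics are available.
    assert_eq!(unsafe { first_match_in_16(&block, b'Z') }, Some(9));
}

#[cfg(not(target_arch = "aarch64"))]
fn main() {}
```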
- Rust-Cache
I agree with everything you said, but I don't see how it leads to the OP's formulation being silly or wrong or not useful. Here's another example (of my own) where you can pick an upper bound for `n` and base your complexity analysis around it. In this case, we're trying to provide an API guarantee that a search takes O(m+n) time despite wanting to use an O(mn) algorithm in some subset of cases. We can still meet the O(m+n) bound by reasoning that the O(mn) algorithm is only used in a finite set of cases, and thus collapses to O(1) time. Therefore, the O(m+n) time bound is preserved. And this isn't a trick either. That really is the scaling behavior of the implementation. See: https://github.com/BurntSushi/memchr/blob/ce7b8e606410f6c81a...
> If an algorithm is defined as being unscalable (fixed input), what sense does it make to describe that it scales constantly with input size?
I'll answer your question with another: in what cases does it make sense to describe the scaling behavior of an algorithm with O(1)?
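The shape of that O(m+n)-despite-O(mn) argument can be shown in a few lines. The sketch below is illustrative only: the size cap, the quadratic routine, and the fallback are hypothetical stand-ins, not the memchr crate's actual policy or code.

```rust
// Sketch of an O(m+n) API guarantee that internally allows an O(m*n)
// routine, but only on inputs below a fixed constant size, so that branch
// contributes O(1) and the overall bound is preserved.
const SMALL: usize = 64; // fixed cap; work under it is bounded by a constant

/// O(m*n) in general, but only ever called with m, n <= SMALL, so each
/// call does at most SMALL * SMALL work.
fn quadratic_find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    (0..=haystack.len().checked_sub(needle.len())?)
        .find(|&i| &haystack[i..i + needle.len()] == needle)
}

/// Stand-in for a worst-case O(m+n) searcher such as Two-Way. (The windows
/// iterator is just a placeholder for a guaranteed-linear implementation.)
fn linear_find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Public entry point with an O(m+n) guarantee: the quadratic branch is
/// restricted to constant-sized inputs, so it collapses to O(1).
pub fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    if haystack.len() <= SMALL && needle.len() <= SMALL {
        quadratic_find(haystack, needle)
    } else {
        linear_find(haystack, needle)
    }
}

fn main() {
    assert_eq!(find(b"hello world", b"world"), Some(6));
    assert_eq!(find(&vec![b'a'; 10_000], b"ab"), None);
}
```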
- Rust memchr adds aarch64 SIMD with impressive speedups
- Stringzilla: Fastest string sort, search, split, and shuffle using SIMD
- Ripgrep now twice as fast on Apple Silicon with new aarch64 SIMD implementations
- Regex Engine Internals as a Library
The current PR for ARM SIMD[1] uses a different instruction mix to achieve the same goals as movemask. I tested the PR and it has a significant speedup over the non-vectorized version.
[1]: https://github.com/BurntSushi/memchr/pull/114
What are some alternatives?
- CodeTriage - Discover the best way to get started contributing to Open Source projects
- duf - Disk Usage/Free Utility - a better 'df' alternative
- NTop - 💻 htop-like system-monitor for Windows with Vi-keybindings.
- thefuck - Magnificent app which corrects your previous console command.
- xrdp - xrdp: an open source RDP server
- bottom - Yet another cross-platform graphical process/system monitor.