| | cw | tantivy |
|---|---|---|
| | 5 | 18 |
| Stars | 100 | 5,829 |
| Growth | - | - |
| Activity | 0.0 | 9.3 |
| | over 1 year ago | over 2 years ago |
| Language | Rust | Rust |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
cw
- why GNU grep is fast
  For things that are commonly and almost ideally represented as text files, there are a lot of Rust-based alternatives that are faster and have more features than the old Unix/GNU tools: ripgrep, fd, cw, and you can find more in this list.
- A wc clone, written in Go
  Nice, beats my old Rust wc through sheer brute force on my old 12c/24t server:
- How to learn Rust by own tiny applications?
  A lot of Unix-y tools have been rewritten in Rust, where the usefulness comes from being faster or having more features. Examples: bat, cw, lsd, ripgrep, diskonaut, gping. Maybe you could find an interesting program to rewrite?
- Awesome Rewrite It In Rust - A curated list of replacements for existing software written in Rust
  cw, an optionally-multithreaded bytecount-accelerated wc clone
- Debian Running on Rust Coreutils
  Having written a Rust wc implementation a few years ago (https://github.com/Freaky/cw), I had a look at theirs.
  It's pretty naive: a simple linewise read_until loop, a conditional to avoid word splitting and such if it's not needed, and for some reason it collects results into an array and prints when it's done rather than printing as it goes.
  It doesn't support --files0-from like GNU wc, so it isn't a drop-in replacement from that perspective. It also has the sadly common Rust trope of only supporting filenames that are valid UTF-8.
  It doesn't seem overly slow considering its simplicity, usually trading blows with GNU and BSD wc. Perhaps the most glaring omission is the lack of a fast path for -c, which should reduce to a stat() call. It is also unfortunate that it doesn't use the excellent bytecount crate to provide a very fast -l/-m path.
  The read_until loop also makes its memory use unpredictable compared with other wc implementations. If you run it on /dev/zero it will try to eat your computer.
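The two fast paths described in the last comment can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not code from cw or from the coreutils rewrite; the function names `byte_count` and `line_count` are hypothetical. A plain iterator count stands in for the bytecount crate here, and the 64 KiB buffer size is an arbitrary assumption.

```rust
use std::fs;
use std::io::{self, Read};
use std::path::Path;

/// Byte count (like `wc -c`) without reading the file: ask the filesystem
/// for the size. Only meaningful for regular files; pipes and devices
/// would still need a read loop.
fn byte_count(path: &Path) -> io::Result<u64> {
    Ok(fs::metadata(path)?.len())
}

/// Line count (like `wc -l`) using a fixed-size buffer, so memory use stays
/// bounded even when the input never contains a newline.
fn line_count<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut buf = [0u8; 64 * 1024];
    let mut lines = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // Retry if the read was interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Naive newline count; a SIMD-accelerated counter could be swapped in here.
        lines += buf[..n].iter().filter(|&&b| b == b'\n').count() as u64;
    }
    Ok(lines)
}
```

Because `line_count` accepts any `Read`, it can be called with a `File`, standard input, or an in-memory byte slice. Unlike a `read_until` loop, it never buffers more than one chunk, so an endless newline-free input costs time but not memory.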
tantivy
- Hey y'all back again w/ the personal, self-hosted search engine
  Backend uses tantivy to index the web pages, sqlite3 to hold metadata / crawl queue
- Ask HN: What are some good rust code to read to learn the language?
- Looking for recommendations of well maintained open source rust codebases that I can look through/contribute to
  Tantivy is a very well-made library and also follows a lot of the best practices. If you like search, you'll like this: https://github.com/quickwit-inc/tantivy
- self hosted elasticsearch alternative
  tantivy - More of a search engine library than an out-of-the-box solution
- What's your favourite open source Rust project that needs more recognition?
  Tantivy search engine.
- Is there a library for instant arbitrary text searching?
  You could try the Tantivy crate, with an n-gram tokenizer, which would split and index your text in sliding groups of n characters.
- Zest: a CLI tool for zettelkasten-like note management
  I had to look up the "tantivy" that the README mentions: https://github.com/tantivy-search/tantivy. You might want to add a link to the project in your README.
- Are you using Rust at work? If yes, for what?
  We're using Rust for a domain-specific search engine. When I first learned Rust some years ago my first thought was that this language is perfect for heavy text processing. IMO, &str is that single killer feature that got me sold :) The search engine that we're building is based on https://github.com/tantivy-search/tantivy.
- Tantivy, a full-text search engine library in Rust inspired by Apache Lucene
- Tantivy v0.15 released! Now backed by Quickwit Inc.!
  Well spotted. Like IPFS, there's a comment about that here: https://github.com/tantivy-search/tantivy/pull/1067#issuecomment-853139923 that points to the distributed wikipedia mirror project https://github.com/ipfs/distributed-wikipedia-mirror/issues/76
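One of the comments above suggests an n-gram tokenizer that splits text into sliding groups of n characters. The idea can be sketched in plain standard-library Rust as below. This is an illustration of the technique only, not Tantivy's tokenizer API; the function name `char_ngrams` is hypothetical.

```rust
/// Split `text` into overlapping windows of `n` characters.
/// Operates on `char`s rather than bytes so multi-byte UTF-8 text
/// is never cut in the middle of a character.
fn char_ngrams(text: &str, n: usize) -> Vec<String> {
    if n == 0 {
        // `windows(0)` would panic, so treat n = 0 as "no grams".
        return Vec::new();
    }
    let chars: Vec<char> = text.chars().collect();
    chars
        .windows(n)
        .map(|window| window.iter().collect::<String>())
        .collect()
}
```

For example, `char_ngrams("tantivy", 3)` yields `tan`, `ant`, `nti`, `tiv`, `ivy`. Indexing every such gram is what lets a query for an arbitrary substring of at least n characters find matching documents, at the cost of a larger index.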
What are some alternatives?
gping - Ping, but with a graph
sonic - 🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.
CompactGUI - Transparently compress active games and programs using Windows 10/11 APIs [Moved to: https://github.com/IridiumIO/CompactGUI]
tantivy-wasm
regex - An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
pueue - 🌠 Manage your shell commands.
ht - Friendly and fast tool for sending HTTP requests
neon - Rust bindings for writing safe and fast native Node.js modules.
nushell - A new type of shell
neuron - Future-proof note-taking and publishing based on Zettelkasten (superseded by Emanote: https://github.com/srid/emanote)
awesome-rewrite-it-in-rust - A curated list of replacements for existing software written in Rust [Moved to: https://github.com/TaKO8Ki/awesome-alternatives-in-rust]
zk - A plain text note-taking assistant