ripgrep
xsv
Our great sponsors
ripgrep | xsv | |
---|---|---|
348 | 64 | |
44,594 | 10,058 | |
- | - | |
9.3 | 0.0 | |
16 days ago | about 2 months ago | |
Rust | Rust | |
The Unlicense | The Unlicense |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ripgrep
-
Ask HN: What software sparks joy when using?
ripgrep - https://github.com/BurntSushi/ripgrep
-
Code Search Is Hard
Basic code searching skills seems like something new developers are never explicitly taught, but which is an absolutely crucial skill to build early on.
I guess the knowledge progression I would recommend would look something kind this:
- Learning about Ctrl+F, which works basically everywhere.
- Transitioning to ripgrep https://github.com/BurntSushi/ripgrep - I wouldn't even call this optional, it's truly an incredible and very discoverable tool. Requires keeping a terminal open, but that's a good thing for a newbie!
- Optional, but highly recommended: Learning one of the powerhouse command line editors. Teenage me recommended Emacs; current me recommends vanilla vim, purely because some flavor of it is installed almost everywhere. This is so that you can grep around and edit in the same window.
- In the same vein, moving back from ripgrep and learning about good old fashioned grep, with a few flags rg uses by default: `grep -r` for recursive search, `grep -ri` for case insensitive recursive search, and `grep -ril` for case insensitive recursive "just show me which files this string is found in" search. Some others too, season to taste.
- Finally hitting the wall with what ripgrep can do for you and switching to an actual indexed, dedicated code search tool.
-
Level Up Your Dev Workflow: Conquer Web Development with a Blazing Fast Neovim Setup (Part 1)
live grep: ripgrep
-
Modern Java/JVM Build Practices
The world has moved on though to opinionated tools, and Rust isn't even the furthest in that direction (That would be Go). The equivalent of those two lines in Cargo.toml would be this example of a basic configuration from the jacoco-maven-plugin: https://www.jacoco.org/jacoco/trunk/doc/examples/build/pom.x... - That's 40 lines in the section to do the "defaults".
Yes, you could add a load of config for files to include/exclude from coverage and so on, but the idea that that's a norm is way more common in Java projects than other languages. Like here's some example Cargo.toml files from complicated Rust projects:
Servo: https://github.com/servo/servo/blob/main/Cargo.toml
rust-gdext: https://github.com/godot-rust/gdext/blob/master/godot-core/C...
ripgrep: https://github.com/BurntSushi/ripgrep/blob/master/Cargo.toml
socketio: https://github.com/1c3t3a/rust-socketio/blob/main/socketio/C...
-
Ugrep – a more powerful, ultra fast, user-friendly, compatible grep
Another issue with Hyperscan is that if you enable HS_FLAG_UTF8[1], which hypergrep does[2,3], and then search invalid UTF-8, then the result is UB.
> This flag instructs Hyperscan to treat the pattern as a sequence of UTF-8 characters. The results of scanning invalid UTF-8 sequences with a Hyperscan library that has been compiled with one or more patterns using this flag are undefined.
That's another issue you'll need to grapple with if you use Hyperscan. PCRE2 used to have this issue[4], but they've since defined the semantics of searching invalid UTF-8 with Unicode mode enabled. ripgrep 14 uses that new mode, but I haven't updated that FAQ answer yet.
[1]: https://intel.github.io/hyperscan/dev-reference/api_files.ht...
[2]: https://github.com/p-ranav/hypergrep/blob/ee85b713aa84e0050a...
[3]: https://github.com/p-ranav/hypergrep/blob/ee85b713aa84e0050a...
[4]: https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md#why...
I'm not clear on why you're seeing the results you are. It could be because your haystack is so small that you're mostly just measuring noise. ripgrep 14 did introduce some optimizations in workloads like this by reducing match overhead, but I don't think it's anything huge in this case. (And I just tried ripgrep 13 on the same commands above and the timings are similar if a tiny bit slower.)
Interesting, it supports an n-gram indexer. ripgrep has had this planned for a few years now [1] but hasn't implemented it yet. For large codebases I've been using csearch, but it has a lot of limitations.
Unfortunately... I just tried the indexer and it's extremely slow on my machine. It took 86 seconds to index a Linux kernel tree, while csearch's cindex tool took 8 seconds.
- Tell HN: My Favorite Tools
-
Potencializando Sua Experiência no Linux: Conheça as Ferramentas em Rust para um Desenvolvimento Eficiente
Explore o Ripgrep no repositório oficial: https://github.com/BurntSushi/ripgrep
xsv
-
Show HN: TextQuery – Query and Visualize Your CSV Data in Minutes
I realize it's not really that comparable since these tools don't support SQL, but a more fully functioned CLI tool is - https://github.com/BurntSushi/xsv
They are both fairly good
- Qsv: Efficient CSV CLI Toolkit
-
Joining CSV Data Without SQL: An IP Geolocation Use Case
I have done some similar, simpler data wrangling with xsv (https://github.com/BurntSushi/xsv) and jq. It could process my 800M rows in a couple of minutes (plus the time to read it out from the database =)
-
Qsv: CSVs sliced, diced and analyzed (fork of xsv)
xsv, which seems to be why qsv was created.
-
I wrote this iCalendar (.ics) command-line utility to turn common calendar exports into more broadly compatible CSV files.
CSV utilities (still haven't pick a favorite one...): https://github.com/harelba/q https://github.com/BurntSushi/xsv https://github.com/wireservice/csvkit https://github.com/johnkerl/miller
- Icsp – Command-line iCalendar (.ics) to CSV parser
-
ripgrep is faster than {grep, ag, git grep, ucg, pt, sift}
$ git remote -v origin [email protected]:rust-lang/rust (fetch) origin [email protected]:rust-lang/rust (push) $ git rev-parse HEAD 3b0d4813ab461ec81eab8980bb884691c97c5a35 $ time grep -ri burntsushi ./ ./src/tools/cargotest/main.rs: repo: "https://github.com/BurntSushi/ripgrep", ./src/tools/cargotest/main.rs: repo: "https://github.com/BurntSushi/xsv", grep: ./target/debug/incremental/cargotest-2dvu4f2km9e91/s-gactj3ma2j-1b10l4z-2l60ur55ixe6n/query-cache.bin: binary file matches grep: ./target/debug/incremental/cargotest-38cpmhhbdgdyq/s-gactj3luwq-1o12vgp-t61hd8qdyp7t/query-cache.bin: binary file matches grep: ./target/debug/incremental/cargotest-17632op6djxne/s-gawuq5468i-1h69nfw-4gm0s8yhhiun/query-cache.bin: binary file matches grep: ./target/debug/incremental/cargotest-2trm4kt5yom3r/s-gawuq53qqg-bjiezj-lo0gha8ign8w/query-cache.bin: binary file matches grep: ./target/debug/deps/libregex_automata-c74a6d9fd0abd77b.rmeta: binary file matches grep: ./target/debug/deps/libsame_file-a0e0363a2985455d.rlib: binary file matches grep: ./target/debug/deps/libsame_file-a0e0363a2985455d.rmeta: binary file matches grep: ./target/debug/deps/libsame_file-7251d8d3586a319b.rmeta: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/libaho_corasick-999a08e2b700420d.rlib: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/libregex_automata-0d168be5d25b3ac5.rlib: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu/release/deps/libregex_automata-7d6bec0156f15da1.rlib: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu/release/deps/libregex_automata-7d6bec0156f15da1.rmeta: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu/release/deps/libaho_corasick-07dee4514b87d99b.rmeta: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu/release/deps/libaho_corasick-07dee4514b87d99b.rlib: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/libaho_corasick-999a08e2b700420d.rlib: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/libaho_corasick-999a08e2b700420d.rmeta: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/libregex_automata-0d168be5d25b3ac5.rlib: binary file matches grep: ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/libregex_automata-0d168be5d25b3ac5.rmeta: binary file matches grep: ./build/bootstrap/debug/deps/libaho_corasick-992e1ba08ef83436.rmeta: binary file matches grep: ./build/bootstrap/debug/deps/libignore-54d41239d2761852.rmeta: binary file matches grep: ./build/bootstrap/debug/deps/libsame_file-9a5e3ddd89cfe599.rlib: binary file matches grep: ./build/bootstrap/debug/deps/libregex_automata-8e700951c9869a66.rlib: binary file matches grep: ./build/bootstrap/debug/deps/libignore-54d41239d2761852.rlib: binary file matches grep: ./build/bootstrap/debug/deps/libaho_corasick-992e1ba08ef83436.rlib: binary file matches grep: ./build/bootstrap/debug/deps/libregex_automata-8e700951c9869a66.rmeta: binary file matches grep: ./build/bootstrap/debug/deps/libsame_file-9a5e3ddd89cfe599.rmeta: binary file matches real 16.683 user 15.793 sys 0.878 maxmem 8 MB faults 0
-
Any Linux admins willing to try Pygrep?
Unrelated, are you the same burntsushi that wrote xsv?
-
Analyzing multi-gigabyte JSON files locally
If it could be tabular in nature, maybe convert to sqlite3 so you can make use of indexing, or CSV to make use of high-performance tools like xsv or zsv (the latter of which I'm an author).
https://github.com/BurntSushi/xsv
https://github.com/liquidaty/zsv/blob/main/docs/csv_json_sql...
What are some alternatives?
telescope-live-grep-args.nvim - Live grep with args
fd - A simple, fast and user-friendly alternative to 'find'
ugrep - NEW ugrep 5.1: an ultra fast, user-friendly, compatible grep. Ugrep combines the best features of other grep, adds new features, and searches fast. Includes a TUI and adds Google-like search, fuzzy search, hexdumps, searches nested archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more
the_silver_searcher - A code-searching tool similar to ack, but faster.
fzf - :cherry_blossom: A command-line fuzzy finder
csvtk - A cross-platform, efficient and practical CSV/TSV toolkit in Golang
alacritty - A cross-platform, OpenGL terminal emulator.
miller - Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
python-regex-cheatsheet - Python 2.7 Regular Expression cheatsheet, as a restructured text document and Makefile to convert it to PDF
Parallel
delta - A syntax-highlighting pager for git, diff, and grep output
Servo - Servo, the embeddable, independent, memory-safe, modular, parallel web rendering engine