bu
ripgrep
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
bu
-
Nim
I think Nim is great for small CLIs. Some examples are over at: https://github.com/c-blake/bu . To quantify "small", using tools themselves in bu/ (and Zsh *):
wc -l --total=never **.nim|cols 1|cstats ms q.05 q.95
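For readers without bu/ installed, a rough stand-in for that pipeline can be sketched with standard tools. This is only an approximation under stated assumptions: `line_stats` is a hypothetical helper name, and it reports min/median/max rather than whatever statistics `cstats` prints.

```shell
# line_stats DIR EXT: count, min, median and max line count over all
# files under DIR whose names end in .EXT (hypothetical helper).
line_stats() {
    # One wc call per file avoids the "total" summary line.
    find "$1" -type f -name "*.$2" -exec wc -l {} \; |
        awk '{ print $1 }' |
        sort -n |
        awk '{ v[NR] = $1 }
             END {
                 if (NR == 0) exit 1
                 if (NR % 2) mid = v[(NR + 1) / 2]
                 else        mid = (v[NR / 2] + v[NR / 2 + 1]) / 2
                 print "n=" NR, "min=" v[1], "median=" mid, "max=" v[NR]
             }'
}
```

Something like `line_stats . nim` run at the top of a checkout would then summarize how small the individual tools are.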
-
fdupes: Identify or Delete Duplicate Files
200 lines of Nim [1] seems to run about 9X faster than the 8000 lines of C in fdupes on a little test dir I have. If you need C, I think jdupes [2] is faster as @TacticalCoder points out a couple of times here. In my testing, `dups` is usually faster than `jdupes`, though.
[1] https://github.com/c-blake/bu/blob/main/dups.nim
[2] https://github.com/jbruchon/jdupes
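As a point of reference for what such tools compute, here is a naive baseline in shell. It is not how `dups` or `jdupes` work internally (that is not described here); it simply hashes every file in full, which is usually much slower. `find_dups` is a hypothetical name, and the sketch assumes GNU coreutils `sha256sum` and file names without newlines.

```shell
# find_dups DIR: print "hash  path" for every file under DIR whose
# full-content SHA-256 is shared with at least one other file.
find_dups() {
    find "$1" -type f -exec sha256sum {} + |
        sort |
        awk '{ h = $1 }
             h == prev { if (!open) print prevline; print; open = 1 }
             h != prev { open = 0 }
             { prev = h; prevline = $0 }'
}
```

Sorting puts equal hashes on adjacent lines, so the awk step only has to compare each line with the previous one.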
-
Things I've learned about building CLI tools in Python
You're better off using a compiled language.
If you're interested in a language that's compiled, fast, but as easy and pleasant as Python - I'd recommend you take a look at [Nim](https://nim-lang.org).
And to prove what Nim's capable of - here's a cool repo with 100+ cli apps someone wrote in Nim: [c-blake/bu](https://github.com/c-blake/bu)
-
Removing Garbage Collection from the Rust Language (2013)
20 milliseconds? On my 7 year old Linux box, this little Nim program https://github.com/c-blake/bu/blob/main/wsz.nim runs to completion in 275 microseconds when fully statically linked with musl libc on Linux. That's with a stripped environment (with `env -i`). It takes more like 318 microseconds with my usual 54 environment variables. The program only does about 17 system calls, though.
Additionally, https://github.com/c-blake/cligen makes decent CLI tools a real breeze. If you like some of Go's qualities but the language seems too limited, you might like Nim: https://nim-lang.org. I generally find getting good performance much less of a challenge with Nim, but Nim is undeniably less well known with a smaller ecosystem and less corporate backing.
-
The Awk book’s 60-line version of Make
Often whole program generation in a prog.lang (& ecosystem!) that you already know can substitute for a new prog.lang. Python even has eval. You may be interested in: https://github.com/c-blake/bu/blob/main/doc/rp.md
You can actually get pretty far depending upon boundaries with the always implicit command-option language (when launched from the shell language, anyway). For example, Ben's example can be adapted to:
rp -m^\[A-Za-z\] 'echo nr," ",s[1]'
-
Learn GNU Awk with hundreds of examples and exercises
You might consider: https://github.com/c-blake/bu/blob/main/doc/cols.md
That's in Nim, though that may not be much of a barrier. (There may also be other tools in bu/ of interest.)
-
GNU Parallel, where have you been all my life?
This sounds like a job for what POSIX calls `popen`. You can do `import posix; for line in popen("ls", "r").lines: echo line` in Nim, though you obviously need to replace `echo line` with other desired processing and learn how to do that.
You might also want to consider `rp` which is a program generator-compiler-runner along the lines of `awk` but with all the code just Nim snippets interpolated into a program template: https://github.com/c-blake/bu/blob/main/doc/rp.md . E.g.:
ls -l | rp -pimport\ stats -bvar\ r:RunningStat -wnf\>4 r.push\ 4.f -eecho\ r
-
The Bipolar Lisp Programmer
Nim is terse yet general and can be made even more so with effort. E.g., You can gin up a little framework that is even more terse than awk yet statically typed and trivially convertible to run much faster like https://github.com/c-blake/bu/blob/main/doc/rp.md
You can statically introspect code to then generate related/translated ASTs to create nearly frictionless helper facilities like https://github.com/c-blake/cligen .
You can do all of this without any real run-time speed sacrifices, depending upon the level of effort you put in / your expertise. Since it generates C/C++ or JavaScript, you get all the abilities of backend compilers almost out of the box, like profile-guided optimization or, for JS, JIT compilation.
-
Ask HN: Why did Nim not catch-on like wild fire as Rust did?
I don't know about all your other questions, but the https://github.com/c-blake/cligen CLI framework seems much lower effort / ceremony than even Rust's `argh` and is just about as old as `clap` (both started 8 years ago in 2015).
There are over 50 CLI utilities at https://github.com/c-blake/bu, many of which do something novel rather than just "re-doing ls/find/cat with a twist". While they are really more "ls/ps construction toolkits" with some default configs to get people going, I think https://github.com/c-blake/lc and https://github.com/c-blake/procs are nicer than Rust alternatives. I mention these since you seem interested in such tools.
-
Self Hosted SaaS Alternatives
You are welcome. Thanks are too rarely offered. :-)
You may also be interested in word stemming (such as the Snowball stemmer used in https://github.com/c-blake/nimsearch ) or other NLP techniques. I don't know how internationalized/multi-lingual that stuff is, but conceptually you might want "series of stemmed words" to be the content fragments of interest.
Similarity scores have many applications. Weights on a graph of cancelled downloads ranked by size might be one. :)
Of course, for your specific "truncation" problem, you might also be able to just do an edit distance against the much smaller filenames and compare data prefixes in files or use a SHA256 of a content-based first slice. (There are edit distance algos in Nim in https://github.com/c-blake/cligen/blob/master/cligen/textUt.... as well as in https://github.com/c-blake/suggest .)
Or, you could do a little program like ndup/sh/ndup to create a "mirrored file tree" of such content-based slices then you could use any true duplicate-file finder (like https://github.com/c-blake/bu/blob/main/dups.nim) on the little signature system to identify duplicates and go from path suffixes in those clusters back to the main filesystem. Of course, a single KV store within one or two files would be more efficient than thousands of tiny files. There are many possibilities.
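The "compare data prefixes" idea for truncated downloads can be sketched in shell. This is only an illustration under assumptions: `is_truncation_of` is a hypothetical helper, it hashes the whole smaller file rather than a fixed-size first slice, and it relies on `sha256sum` and `head -c` being available.

```shell
# is_truncation_of SMALL BIG: succeed if the bytes of SMALL equal the
# first bytes of BIG, i.e. SMALL looks like a cut-off copy of BIG.
is_truncation_of() {
    small_size=$(wc -c < "$1" | tr -d ' ')
    big_size=$(wc -c < "$2" | tr -d ' ')
    # A longer file can never be a prefix of a shorter one.
    [ "$small_size" -le "$big_size" ] || return 1
    small_sig=$(sha256sum < "$1" | cut -d' ' -f1)
    big_sig=$(head -c "$small_size" "$2" | sha256sum | cut -d' ' -f1)
    [ "$small_sig" = "$big_sig" ]
}
```

For many files, storing one signature per file (as the comment suggests) avoids re-reading data for every pair; this pairwise check is just the simplest form of the comparison.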
ripgrep
-
Ask HN: What software sparks joy when using?
ripgrep - https://github.com/BurntSushi/ripgrep
-
Code Search Is Hard
Basic code searching skills seem like something new developers are never explicitly taught, but which is an absolutely crucial skill to build early on.
I guess the knowledge progression I would recommend would look something like this:
- Learning about Ctrl+F, which works basically everywhere.
- Transitioning to ripgrep https://github.com/BurntSushi/ripgrep - I wouldn't even call this optional, it's truly an incredible and very discoverable tool. Requires keeping a terminal open, but that's a good thing for a newbie!
- Optional, but highly recommended: Learning one of the powerhouse command line editors. Teenage me recommended Emacs; current me recommends vanilla vim, purely because some flavor of it is installed almost everywhere. This is so that you can grep around and edit in the same window.
- In the same vein, moving back from ripgrep and learning about good old fashioned grep, with a few flags rg uses by default: `grep -r` for recursive search, `grep -ri` for case insensitive recursive search, and `grep -ril` for case insensitive recursive "just show me which files this string is found in" search. Some others too, season to taste.
- Finally hitting the wall with what ripgrep can do for you and switching to an actual indexed, dedicated code search tool.
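The three grep invocations mentioned in the list can be tried on a throwaway tree. This is a minimal sketch; the directory layout and file contents are made up for illustration.

```shell
# Build a tiny tree to search (paths and contents are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/src/util"
printf 'def Parse_Config():\n    pass\n' > "$demo/src/main.py"
printf 'parse_config = None\n' > "$demo/src/util/helpers.py"
printf 'nothing relevant here\n' > "$demo/README"

# -r: recurse into directories; matching is case-sensitive by default.
grep -r 'parse_config' "$demo"

# -ri: recurse and ignore case, so Parse_Config matches as well.
grep -ri 'parse_config' "$demo"

# -ril: same search, but print only the names of the matching files.
grep -ril 'parse_config' "$demo" | sort
```

The first search reports one line from helpers.py, the second adds the line from main.py, and the third lists just those two file names.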
-
Level Up Your Dev Workflow: Conquer Web Development with a Blazing Fast Neovim Setup (Part 1)
live grep: ripgrep
- Ripgrep
-
Modern Java/JVM Build Practices
The world has moved on though to opinionated tools, and Rust isn't even the furthest in that direction (That would be Go). The equivalent of those two lines in Cargo.toml would be this example of a basic configuration from the jacoco-maven-plugin: https://www.jacoco.org/jacoco/trunk/doc/examples/build/pom.x... - That's 40 lines in the section to do the "defaults".
Yes, you could add a load of config for files to include/exclude from coverage and so on, but the idea that that's a norm is way more common in Java projects than other languages. Like here's some example Cargo.toml files from complicated Rust projects:
Servo: https://github.com/servo/servo/blob/main/Cargo.toml
rust-gdext: https://github.com/godot-rust/gdext/blob/master/godot-core/C...
ripgrep: https://github.com/BurntSushi/ripgrep/blob/master/Cargo.toml
socketio: https://github.com/1c3t3a/rust-socketio/blob/main/socketio/C...
-
Ugrep – a more powerful, ultra fast, user-friendly, compatible grep
I'm not clear on why you're seeing the results you are. It could be because your haystack is so small that you're mostly just measuring noise. ripgrep 14 did introduce some optimizations in workloads like this by reducing match overhead, but I don't think it's anything huge in this case. (And I just tried ripgrep 13 on the same commands above and the timings are similar if a tiny bit slower.)
[1]: https://github.com/radare/ired
[2]: https://github.com/BurntSushi/ripgrep/discussions/2597
-
Tell HN: My Favorite Tools
-
Boosting Your Linux Experience: Meet the Rust Tools for Efficient Development
Explore Ripgrep in the official repository: https://github.com/BurntSushi/ripgrep
-
Scrybble is the ReMarkable highlights to Obsidian exporter I have been looking for
🔎🗃️ ripgrep or ugrep (search fast, use regex patterns or fuzzy search, pipe output to bash/zsh shell for further processing V coloring)
-
RFC: Add ngram indexing support to ripgrep (2020)
What are some alternatives?
NimForUE - Nim plugin for UE5 with native performance, hot reloading and full interop that sits between C++ and Blueprints. This allows you to do common UE workflows like for example to extend any UE class in Nim and extending it again in Blueprint if you wish so without restarting the editor. The final aim is to be able to do in Nim what you can do in C++
telescope-live-grep-args.nvim - Live grep with args
Nim - Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
fd - A simple, fast and user-friendly alternative to 'find'
ordiri
ugrep - ugrep 5.1: A more powerful, ultra fast, user-friendly, compatible grep. Includes a TUI, Google-like Boolean search with AND/OR/NOT, fuzzy search, hexdumps, searches (nested) archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more
OffensiveNim - My experiments in weaponizing Nim (https://nim-lang.org/)
the_silver_searcher - A code-searching tool similar to ack, but faster.
awesome-selfhosted - A list of Free Software network services and web applications which can be hosted on your own servers
fzf - :cherry_blossom: A command-line fuzzy finder
core - OPNsense GUI, API and systems backend
alacritty - A cross-platform, OpenGL terminal emulator.