-
fzf-tool-launcher
Browse & preview contents of files, and launch tools/pipelines to process a selected file.
I wrote a couple of shell scripts that let you build command pipelines one step at a time, choosing a tool from a menu at each step, with the ability to preview the results while tweaking the command-line flags. At any point you can go back to the previous step and continue: https://github.com/vapniks/fzf-tool-launcher
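The core mechanism is fzf's live preview. A minimal sketch of the idea (not the actual scripts; the tool menu here is made up):

    # pick a file with a live preview of its contents
    file=$(fzf --preview 'head -n 50 {}')
    # pick a tool from a (hypothetical) menu
    tool=$(printf '%s\n' 'jq .' 'wc -l' 'sort | uniq -c' | fzf)
    # run the chosen pipeline step on the file
    eval "$tool" < "$file"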
-
makefile2graph
Creates a graph of dependencies from GNU Make; output is a Graphviz dot file or a GEXF XML file.
Yes, that's extremely important. I've had great success replacing Airflow, Luigi, and friends with a cron-ed Makefile target that refreshes some database tables (usually materialized views).
I've then used this tool to visualize the execution graph: https://github.com/lindenb/makefile2graph
The result looks like this: https://tselai.com/data/graph.png
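A minimal sketch of that pattern, assuming Postgres and hypothetical view names; the graph command at the end is the one documented in the makefile2graph README:

    # Makefile: one target per materialized view; prerequisites
    # encode refresh order (recipe lines must start with a tab)
    refresh-all: orders_daily revenue_summary

    orders_daily:
            psql mydb -c 'REFRESH MATERIALIZED VIEW orders_daily;'

    revenue_summary: orders_daily
            psql mydb -c 'REFRESH MATERIALIZED VIEW revenue_summary;'

    # cron entry (hypothetical schedule):
    # 0 2 * * * make -C /srv/etl refresh-all
    #
    # render the dependency graph, per the makefile2graph README:
    # make -Bnd | make2graph | dot -Tpng -o graph.png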
-
I'd like to call out one of my favorite pieces of software from the past 10 years: VisiData [1] has completely changed the way I do ad-hoc data processing, and is now my go-to for pretty much all use cases that I previously used spreadsheets for, and about half of those I previously used databases for.
It's a TUI application, not strictly CLI, but scriptable, and I figure anyone building pipelines using tools like jq, q, awk, grep, etc. to process tabular data will find it extremely useful.
----
[1]: https://visidata.org
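For anyone curious how it slots into a shell workflow, a small sketch (the -f and -b flags are from the VisiData docs; filenames are placeholders):

    # open a file in the TUI
    vd data.csv
    # read fixed-width output from another command
    ps aux | vd -f fixed
    # batch mode: convert between formats without opening the UI
    vd -b data.csv -o data.json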
-
Thanks! If anyone else is interested, there is an explanation of this feature here: https://subtxt.in/library-data/2016/03/28/json_stream_jq and in the jq FAQ: https://github.com/jqlang/jq/wiki/FAQ#streaming-json-parser
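For reference, the idiom from the linked FAQ looks roughly like this (assuming huge.json holds one large top-level array):

    # emit the array's elements one at a time without
    # loading the whole document into memory
    jq -cn --stream 'fromstream(1 | truncate_stream(inputs))' huge.json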
The last time I tried, I think the reason I gave up on jq for large inputs was that throughput would max out at 7 MB/s, whereas the same thing with Spark SQL on the same hardware (a MacBook) would max out at 250 MB/s. So I started looking into other solutions for big data, while I use jq in parallel for small data spread across multiple files.
I will test it out again, since it has been 4-5 years since I last tried, but I believe jaq is still preferred for large inputs. Still, for big data I prefer Spark/Polars/ClickHouse etc.
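The "jq in parallel" part can be as simple as GNU parallel fanning out over files (a sketch; the filter and paths are hypothetical):

    # run the same jq filter over many small files, 8 at a time,
    # writing one NDJSON output per input ({.} strips the extension)
    parallel -j8 "jq -c '.items[]' {} > {.}.ndjson" ::: data/*.json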