vnlog
qsv
vnlog
- Vnlog: Process labelled tabular ASCII data using normal Unix tools
- Process tabular data with Unix tools
- Qsv: Efficient CSV CLI Toolkit
For simple analyses (i.e. what most people do most of the time) doing this on the commandline gets you there faster. I use vnlog (https://github.com/dkogan/vnlog/). By the time you fired up your editor to write your Python code, I already have analyses and plots ready.
- Joining CSV Data Without SQL: An IP Geolocation Use Case
An alternative that is very appropriate for some use cases: `vnl-join` from the vnlog toolkit (https://github.com/dkogan/vnlog). It uses the `join` tool from coreutils (works well, has been around forever), and `vnlog` for nice column labelling.
- Miller: Like Awk, sed, cut, join, and sort for CSV, TSV, and tabular JSON
There's also https://github.com/dkogan/vnlog/ which is a wrapper around the existing coreutils, so all the options work, and there's nothing to learn
- vnlog: making awk and sort and join (and friends) smarter
- Awk equivalents to SQL query data manipulation
And to improve the ergonomics, the vnlog wrappers are available to operate on field names, while retaining the internals of the core tools:
https://github.com/dkogan/vnlog/
- Vnlog: Making Awk, grep, sort and join smarter
- Learn to Process Text in Linux Using Grep, Sed, and Awk
I sorta, kinda agree. Tools written in AWK (and friends) are indeed somewhat unmaintainable, but they're really close to being just right for a LOT of applications. The vnlog toolkit (https://github.com/dkogan/vnlog) adds just a little bit of syntactic sugar to the usual commandline tools to make processing scripts robust and easy to read and write. This was not my intent initially, but I now do most of my data processing with the shell and vnl-wrapped awk (and sort and join, ...) It's really nice. If you write stuff in awk, you should check it out. (Disclaimer: I'm the author)
- Extending Awk with Field Labels
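The idea running through the comments above, addressing columns by label while still using plain awk underneath, can be sketched without vnlog itself. The following is a minimal illustration, not vnlog's implementation: `filter_by_label` is a hypothetical helper, and it assumes a vnlog-style legend (the first line starting with `#` names the columns, while `##` and `#!` lines are comments).

```shell
#!/bin/sh
# Hypothetical helper (not part of vnlog): keep rows whose named column is >= a minimum.
# Assumption: the first line starting with "# " is the legend, e.g. "# time temp".
filter_by_label() {
    # $1 = column label, $2 = minimum numeric value
    awk -v col="$1" -v min="$2" '
        /^#[#!]/ { next }                  # "##" and "#!" lines: comments (assumed convention)
        /^#/ {
            if (!have_legend) {
                # Field 1 is the "#" itself, so label i maps to data field i - 1.
                for (i = 2; i <= NF; i++) idx[$i] = i - 1
                if (!(col in idx)) {
                    print "no such column: " col > "/dev/stderr"
                    exit 1
                }
                have_legend = 1
                print                      # pass the legend through unchanged
            }
            next
        }
        have_legend && ($(idx[col]) + 0 >= min + 0) { print }
    '
}
```

For instance, `printf '# time temp\n1 20.5\n2 31.0\n3 25.2\n' | filter_by_label temp 25` passes the legend through and keeps only the rows for times 2 and 3. The script never hard-codes a field number, so reordering the columns in the input does not break it.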
qsv
- FLaNK Weekly 31 December 2023
- Qsv: Efficient CSV CLI Toolkit
Thanks for the detailed feedback @snidane!
As maintainer of qsv, here's my reply:
- Given qsv's rapid release cycle (173 releases over three years), the auto-update check is essential at the moment. Once we reach 1.0, I'll turn it off. For now, given your feedback, I've only made it check 10% of the time.
- Pivot is in the backlog and I'll be sure to add unpivot when I implement it. (https://github.com/jqnatividad/qsv/issues/799)
- I'll add a dedicated summing command with the group by (-by) and window by (-over) capability (https://github.com/jqnatividad/qsv/issues/1514). Do note that `stats` has basic sum as @ezequiel-garzon pointed out.
- With the `enum` command, qsv can achieve what you proposed with `laminate`. E.g. `qsv enum --new-column newcol --constant newconstant mydata.csv --output laminated-data.csv`
- With the `cat rowskey` command, qsv can already concatenate files with mismatched headers.
- Other file formats: qsv supports parquet, csv, tsv, excel, ods, datapackage, sqlite and more (see https://github.com/jqnatividad/qsv/tree/master#file-formats). Fixed-format is not supported yet, though it is quite interesting, and I have added it to the backlog (https://github.com/jqnatividad/qsv/issues/1515)
- As to "enable embedding outputs of commands", qsv is composable by design, so you can use standard stdin/stdout redirection/piping techniques to have it work with other CLI tools like jq, awk, etc.
Finally, I just released v0.120.0, which already incorporates the less aggressive self-update check. https://github.com/jqnatividad/qsv/releases/tag/0.120.0
- Joining CSV Data Without SQL: An IP Geolocation Use Case
- Why my favourite API is a zipfile on the European Central Bank's website
qsv [1] also has a sqlp command which lets you run Polars SQL queries (even on multiple files). Here I'll send the csv data from stdin (represented by -) and then (optionally) pipe the output to the table command for formatting. The shape of the result is also printed to stderr (the (4, 2) below).
[1] https://github.com/jqnatividad/qsv
$ echo 'Name,Department,Salary
- Qsv: Ultra-fast CSV data-wrangling toolkit
- Qsv: CSVs sliced, diced and analyzed (fork of xsv)
- Nushell.sh ls | where size > 10mb | sort-by modified
- Do Rust and Lua work well together?
It works quite well IMHO. Using the mlua crate, I've managed to integrate Luau as a very powerful data-wrangling DSL for qsv (https://github.com/jqnatividad/qsv)
- How manipulate this CSV in Python?
Maybe this might be better done using this? https://github.com/jqnatividad/qsv
- How to convert xslx to csv using Rust?
https://github.com/jqnatividad/qsv is another option.
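The composability point made in the maintainer's reply (a stage reads CSV on stdin and writes CSV on stdout, so it can sit anywhere in a pipeline) can be illustrated with standard tools alone. The sketch below does not invoke qsv: `sort_csv_desc` is a hypothetical stand-in stage, and it naively assumes fields contain no quoted commas or embedded newlines.

```shell
#!/bin/sh
# Hypothetical pipeline stage (a stand-in, not a qsv command):
# pass the header through, then sort the remaining rows numerically, descending.
# Assumption: simple CSV with no quoted commas or embedded newlines.
sort_csv_desc() {
    # $1 = 1-based field number to sort on
    {
        # read consumes exactly one line from the pipe; sort receives the rest
        IFS= read -r header && printf '%s\n' "$header"
        sort -t, -k"$1","$1"nr
    }
}
```

Because the stage only touches stdin and stdout, it chains with other tools, for example `printf 'name,dept,salary\nann,eng,120\nbob,ops,90\ncat,eng,150\n' | sort_csv_desc 3 | awk -F, 'NR == 1 || $2 == "eng"'`, which keeps the header and the two `eng` rows with the highest salary first.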
What are some alternatives?
ttyplot - a realtime plotting utility for terminal/console with data input from stdin
miller - Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
matplotplusplus - Matplot++: A C++ Graphics Library for Data Visualization
calamine - A pure Rust Excel/OpenDocument SpreadSheets file reader: rust on metal sheets
RecordStream - commandline tools for slicing and dicing JSON records.
fortune-sheet - A drop-in javascript spreadsheet library that provides rich features like Excel and Google Sheets
nvim-ipy - IPython/Jupyter plugin for Neovim
tsv-utils - eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
jupytext.vim - Vim plugin for editing Jupyter ipynb files via jupytext
csvquote
matplotlib - C++ wrappers around python's matplotlib
goawk - A POSIX-compliant AWK interpreter written in Go, with CSV support