Qsv: Efficient CSV CLI Toolkit

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • qsv

    CSVs sliced, diced & analyzed.

  • Thanks for the detailed feedback @snidane!

    As maintainer of qsv, here's my reply:

    - Given qsv's rapid release cycle (173 releases over three years), the auto-update check is essential at the moment. Once we reach 1.0, I'll turn it off. For now, given your feedback, I've made it check only 10% of the time.

    - Pivot is in the backlog and I'll be sure to add unpivot when I implement it. (https://github.com/jqnatividad/qsv/issues/799)

    - I'll add a dedicated summing command with the group by (-by) and window by (-over) capability (https://github.com/jqnatividad/qsv/issues/1514). Do note that `stats` has basic sum as @ezequiel-garzon pointed out.

    - With the `enum` command, qsv can achieve what you proposed with `laminate`. E.g. `qsv enum --new-column newcol --constant newconstant mydata.csv --output laminated-data.csv`

    - With the `cat rowskey` command, qsv can already concatenate files with mismatched headers.

    - Other file formats: qsv supports parquet, csv, tsv, excel, ods, datapackage, sqlite and more (see https://github.com/jqnatividad/qsv/tree/master#file-formats). Fixed-format is not supported yet, but it is quite interesting, so I have added it to the backlog (https://github.com/jqnatividad/qsv/issues/1515).

    - As to "enable embedding outputs of commands": qsv is composable by design, so you can use standard stdin/stdout redirection and piping to have it work with other CLI tools like jq, awk, etc.

    Finally, I just released v0.120.0, which already incorporates the less aggressive self-update check: https://github.com/jqnatividad/qsv/releases/tag/0.120.0
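
To make two of the operations in the reply above concrete, here is a minimal Python sketch of what they amount to conceptually: concatenating CSV files whose headers differ, and adding a constant-valued column to every row. This is not qsv's implementation and does not call qsv. The function names `concat_mismatched` and `add_constant_column` are hypothetical, and the choices to take the union of headers in order of first appearance and to fill missing cells with empty strings are assumptions made for illustration.

```python
import csv
import io


def concat_mismatched(csv_texts):
    """Concatenate CSV documents whose headers differ (hypothetical helper).

    Assumption: the output header is the union of all input headers, in order
    of first appearance, and cells a source file lacks are left empty.
    """
    header = []
    rows = []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        # Collect any column names not seen in earlier inputs.
        for name in reader.fieldnames or []:
            if name not in header:
                header.append(name)
        rows.extend(reader)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=header, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def add_constant_column(csv_text, column, value):
    """Append a column holding the same value in every row (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = list(reader.fieldnames or []) + [column]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = value
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    first = "id,name\n1,ann\n"
    second = "id,city\n2,Oslo\n"
    print(concat_mismatched([first, second]))
    print(add_constant_column(first, "newcol", "newconstant"))
```

Both helpers take and return CSV text, so they can sit in a stdin/stdout pipeline in the same spirit as the composability point in the reply. For large files, a dedicated tool that streams rows would likely be a better fit than this in-memory sketch.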

  • vnlog

    Process labelled tabular ASCII data using normal UNIX tools

  • For simple analyses (i.e. what most people do most of the time), doing this on the command line gets you there faster. I use vnlog (https://github.com/dkogan/vnlog/). By the time you've fired up your editor to write your Python code, I already have analyses and plots ready.

  • miller

    Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

  • xsv

    A fast CSV command line toolkit written in Rust.

  • teip

    Masking tape to help commands "do one thing well"

  • citation-file-format

    The Citation File Format lets you provide citation metadata for software or datasets in plaintext files that are easy to read by both humans and machines.

  • I am somewhat tickled at the thought of citing everything in a malicious compliance kind of way. Given a Nix environment, it should be possible to pull down a list of every bit of code that was used to construct the OS. Would we have to differentiate between installed vs executed code? My LaTeX environment probably has thousands of packages, though I might directly include only a handful of them. Even if I include a LaTeX package, it might not get executed.

    The CITATION.cff format[0] is a newish format to solve the machine identification of citable works, but I suspect it is too new to see widespread adoption. It is going to take some backbreaking regexes to extract "How to Cite" sections embedded in READMEs and buried in the source.

    [0] https://citation-file-format.github.io/
