Modernizing AWK, a 45-year-old language, by adding CSV support

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • tsv-utils

    eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.

    For anything down and dirty, what's wrong with -F'"'? For anything fancy there are plenty of things like the below.

    eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.

    includes csv to tsv: https://github.com/eBay/tsv-utils

    HT: https://simonwillison.net/
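To make the trade-off in the comment above concrete, here is a minimal Python sketch, not taken from the thread, that compares splitting on the double-quote character (roughly what `-F'"'` does in awk) with a real CSV parser. The sample row is invented for illustration.

```python
import csv
import io

# Hypothetical sample row: the second field contains a comma,
# the third contains doubled (escaped) quotes.
line = 'id,"Smith, John","said ""hi"""\n'

# Quick and dirty: split on the double-quote character, like awk -F'"'.
quick = line.rstrip("\n").split('"')

# Proper parsing: the csv module understands quoting and doubled quotes.
proper = next(csv.reader(io.StringIO(line)))

print(quick)   # pieces between quote characters, including empty ones
print(proper)  # the three logical fields
```

The second piece of the split is the quoted name, which is why the trick works for simple files. The doubled quotes in the last field produce empty pieces, which is where a dedicated CSV tool becomes the safer choice.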

  • usv

    Unicode Separated Values (USV) data markup for units, records, groups, files, streaming, and more.

    Ben this is great, thank you. Would you be open to adding Unicode Separated Values (USV) as well? It's much like CSV and also simpler because of no escaping and no quoting. I can donate $50 to you or your charity of choice as a token of thanks and encouragement.

    https://github.com/sixarm/usv
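The "no escaping and no quoting" point can be illustrated with a short Python sketch. It assumes that USV uses the Unicode characters U+241F (unit separator) and U+241E (record separator), as described in the usv repository; the function name `parse_separated` and the sample data are hypothetical. Passing `"\x1f"` and `"\x1e"` instead gives the ASCII-separator variant.

```python
# Assumed separators: the Unicode symbols for unit and record separator.
UNIT_SEP = "\u241f"
RECORD_SEP = "\u241e"


def parse_separated(text, unit_sep=UNIT_SEP, record_sep=RECORD_SEP):
    """Split text into records (lists of fields) using plain string splits."""
    records = []
    for chunk in text.split(record_sep):
        chunk = chunk.strip("\r\n")  # tolerate newlines between records
        if chunk:  # skip the empty chunk after the final record separator
            records.append(chunk.split(unit_sep))
    return records


# Hypothetical sample: commas and quotes need no special treatment.
sample = (
    f"name{UNIT_SEP}note{RECORD_SEP}\n"
    f'Ada{UNIT_SEP}likes commas, "quotes"{RECORD_SEP}\n'
)
rows = parse_separated(sample)
print(rows)
```

Because the separators are characters that rarely appear in ordinary text, two calls to `split` are enough for this sketch. It does not attempt to cover groups, files, or any other part of the USV format.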

  • nio

    Low Overhead Numerical/Native IO library & tools (by c-blake)

  • miller

    Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

    I have grown fond of using miller[0] to handle command line data processing. Handles the standard tabular formats (csv, tsv, json) and has all of the standard data cleanup options. Works on streams so (most operations) are not limited by memory.

    [0]: https://github.com/johnkerl/miller

  • simonwillisonblog

    The source code behind my blog

  • xsv

    A fast CSV command line toolkit written in Rust.

    xsv does similar stuff for CSV, and very rapidly: https://github.com/BurntSushi/xsv

    https://miller.readthedocs.io/en/latest/why/ has a nice section on "why miller":

    > First: there are tools like xsv which handles CSV marvelously and jq which handles JSON marvelously, and so on -- but over the years of my career in the software industry I've found myself, and others, doing a lot of ad-hoc things which really were fundamentally the same except for format. So the number one thing about Miller is doing common things while supporting multiple formats: (a) ingest a list of records where a record is a list of key-value pairs (however represented in the input files); (b) transform that stream of records; (c) emit the transformed stream -- either in the same format as input, or in a different format.
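The ingest, transform, emit model in the quote can be sketched in a few lines of Python. This is an illustration of the idea using only the standard library, not Miller's implementation; the function names, the `total` field, and the sample data are all hypothetical.

```python
import csv
import io
import json


def ingest_csv(text):
    """(a) Read CSV text as a stream of records (dicts of key-value pairs)."""
    yield from csv.DictReader(io.StringIO(text))


def transform(records):
    """(b) Transform the stream: add a hypothetical 'total' field."""
    for rec in records:
        rec = dict(rec)  # copy so the input record is left untouched
        rec["total"] = int(rec["qty"]) * int(rec["price"])
        yield rec


def emit_json_lines(records):
    """(c) Emit each record in a different format: one JSON object per line."""
    for rec in records:
        yield json.dumps(rec)


# Hypothetical input data.
csv_text = "item,qty,price\npen,3,2\nbook,1,15\n"
output = list(emit_json_lines(transform(ingest_csv(csv_text))))
for line in output:
    print(line)
```

Each stage is a generator, so records flow through one at a time rather than being loaded into memory together. Swapping `ingest_csv` or `emit_json_lines` for another format leaves the transform step unchanged, which is the point the quote makes.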

  • csvquote

  • qsv

    CSVs sliced, diced & analyzed.

    I was using xsv a lot at work (it is so much faster than csvkit) but I've recently jumped to qsv, a fork with more features.

    https://github.com/jqnatividad/qsv

  • zsv

    zsv+lib: tabular data swiss-army knife CLI + world's fastest (simd) CSV parser

    This is not an outlier. `mlr` is quite slow, literally off-the-charts slow for our purposes when we benchmarked it against xsv and zsv (see https://github.com/liquidaty/zsv; disclaimer: I'm one of its authors).

  • goawk

    A POSIX-compliant AWK interpreter written in Go, with CSV support

    I've also added explicit tests for ASCII and Unicode unit and record separators, just to ensure I don't regress: https://github.com/benhoyt/goawk/commit/215652de58f33630edb0...

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • [OC]Tidy Viewer (tv) is a cross-platform csv pretty printer that uses column styling to maximize viewer enjoyment.

    5 projects | /r/commandline | 11 Aug 2021
  • Qsv: Efficient CSV CLI Toolkit

    8 projects | news.ycombinator.com | 22 Dec 2023
  • What are the most useful VSCode extensions you know which could be reimplemented in Emacs?

    13 projects | /r/emacs | 31 Mar 2021
  • Shell Cacophony

    4 projects | dev.to | 29 Aug 2024
  • Ask HN: How would you chunk a large Excel file?

    1 project | news.ycombinator.com | 26 May 2024