Modernizing AWK, a 45-year-old language, by adding CSV support

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  1. tsv-utils

    eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.

    For anything down and dirty, what's wrong with -F'"'? For anything fancy there are plenty of things like the below.

    includes csv to tsv: https://github.com/eBay/tsv-utils

    HT: https://simonwillison.net/
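
To illustrate the trade-off behind the -F'"' shortcut mentioned above, here is a minimal Python sketch. It is not taken from the thread: the function names and sample lines are made up for illustration, and `str.split('"')` only stands in for what a quote-only field separator does to a single line. It suggests the shortcut is fine for simple quoted fields but breaks as soon as a field contains an escaped quote.

```python
import csv


def split_on_quotes(line):
    """Mimic a quote-only field separator: cut the line at every double quote."""
    return line.split('"')


def parse_csv_line(line):
    """Parse one line with a real CSV parser (Python's standard csv module)."""
    return next(csv.reader([line]))


# Hypothetical sample lines, not from the original post.
simple = '1,"Smith, John",42'
tricky = '2,"She said ""hi""",7'

# Quick and dirty: the quoted field lands in the second piece.
quick_simple = split_on_quotes(simple)   # ['1,', 'Smith, John', ',42']

# The same trick falls apart on an escaped quote inside the field.
quick_tricky = split_on_quotes(tricky)   # second piece is only 'She said '

# A CSV-aware parser handles both lines.
full_simple = parse_csv_line(simple)     # ['1', 'Smith, John', '42']
full_tricky = parse_csv_line(tricky)     # ['2', 'She said "hi"', '7']
```

The quick split needs no parser at all, which is the appeal for one-off jobs. The dedicated tools earn their keep on the second kind of line.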

  3. usv

    Unicode Separated Values (USV) data markup for units, records, groups, files, streaming, and more.

    Ben this is great, thank you. Would you be open to adding Unicode Separated Values (USV) as well? It's much like CSV and also simpler because of no escaping and no quoting. I can donate $50 to you or your charity of choice as a token of thanks and encouragement.

    https://github.com/sixarm/usv
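
The "no escaping and no quoting" point can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the USV reference implementation: it assumes the unit separator is U+241F and the record separator is U+241E, that each unit and each record is followed by its separator, and that the data itself never contains those two characters. The helper names are hypothetical, so check the USV repository for the actual rules.

```python
# Assumed separators (see the USV project for the authoritative definition).
UNIT_SEP = "\u241F"    # SYMBOL FOR UNIT SEPARATOR
RECORD_SEP = "\u241E"  # SYMBOL FOR RECORD SEPARATOR


def encode_usv(rows):
    """Join rows into one string, ending every unit and every record with its separator."""
    return "".join(
        "".join(unit + UNIT_SEP for unit in row) + RECORD_SEP for row in rows
    )


def _split_terminated(text, sep):
    """Split on sep and drop the empty piece left after the final separator."""
    parts = text.split(sep)
    if parts and parts[-1] == "":
        parts.pop()
    return parts


def decode_usv(text):
    """Split a string back into rows of units. No quoting or unescaping step is needed."""
    return [
        _split_terminated(record, UNIT_SEP)
        for record in _split_terminated(text, RECORD_SEP)
    ]


# Commas, quotes, and newlines are ordinary data here.
rows = [["name", "note"], ["Smith, John", 'said "hi"\non two lines']]
round_trip = decode_usv(encode_usv(rows))
```

Because the separators are characters that rarely occur in real text, the decoder is two plain splits, whereas a CSV parser has to track quote state.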

  4. nio

    Low Overhead Numerical/Native IO library & tools (by c-blake)

  5. miller

    Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

    I have grown fond of using miller[0] to handle command line data processing. It handles the standard tabular formats (csv, tsv, json) and has all of the standard data cleanup options. It works on streams, so most operations are not limited by memory.

    [0]: https://github.com/johnkerl/miller

  6. simonwillisonblog

    The source code behind my blog

  7. xsv

    (Discontinued) A fast CSV command line toolkit written in Rust.

    xsv does similar stuff for CSV, and very rapidly: https://github.com/BurntSushi/xsv

    https://miller.readthedocs.io/en/latest/why/ has a nice section on "why miller":

    > First: there are tools like xsv which handles CSV marvelously and jq which handles JSON marvelously, and so on -- but over the years of my career in the software industry I've found myself, and others, doing a lot of ad-hoc things which really were fundamentally the same except for format. So the number one thing about Miller is doing common things while supporting multiple formats: (a) ingest a list of records where a record is a list of key-value pairs (however represented in the input files); (b) transform that stream of records; (c) emit the transformed stream -- either in the same format as input, or in a different format.
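
The ingest, transform, emit model in the quote above can be sketched with Python generators. This is not Miller's implementation: the function names, the sample data, and the `total` field are hypothetical and only illustrate the idea that the transform step sees key-value records and never needs to know the file format.

```python
import csv
import io
import json


def ingest_csv(stream):
    """(a) Yield each CSV row as a dict of key-value pairs, one record at a time."""
    yield from csv.DictReader(stream)


def add_total(records):
    """(b) Transform the stream: add a computed field to every record."""
    for record in records:
        total = int(record["qty"]) * float(record["price"])
        yield {**record, "total": f"{total:.2f}"}


def emit_csv(records, out):
    """(c) Emit in the same format as the input."""
    writer = None
    for record in records:
        if writer is None:
            # Take the header from the first record's keys.
            writer = csv.DictWriter(out, fieldnames=list(record), lineterminator="\n")
            writer.writeheader()
        writer.writerow(record)


def emit_jsonl(records, out):
    """(c) Or emit in a different format: one JSON object per line."""
    for record in records:
        out.write(json.dumps(record) + "\n")


def convert(text, emit):
    """Run the three stages over an in-memory string and return the output."""
    out = io.StringIO()
    emit(add_total(ingest_csv(io.StringIO(text))), out)
    return out.getvalue()


# Hypothetical sample input.
SAMPLE = "item,qty,price\napple,3,0.50\npear,2,1.25\n"
as_csv = convert(SAMPLE, emit_csv)
as_jsonl = convert(SAMPLE, emit_jsonl)
```

Each stage is a generator, so only one record is held in memory at a time, and switching the output format means swapping the emit function while `add_total` stays unchanged.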

  8. csvquote

  10. qsv

    Blazing-fast Data-Wrangling toolkit

    I was using xsv a lot at work (it is so much faster than csvkit) but I've recently jumped to qsv, a fork with more features.

    https://github.com/jqnatividad/qsv

  11. zsv

    zsv+lib: tabular data swiss-army knife CLI + world's fastest (simd) CSV parser

    This is not an outlier. `mlr` is quite slow, literally off-the-charts slow for our purposes, when we benchmarked it against xsv and zsv (see https://github.com/liquidaty/zsv; disclaimer: I'm one of its authors).

  12. goawk

    A POSIX-compliant AWK interpreter written in Go, with CSV support

    I've also added explicit tests for ASCII and Unicode unit and record separators, just to ensure I don't regress: https://github.com/benhoyt/goawk/commit/215652de58f33630edb0...

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.