Miller CLI – Like Awk, sed, cut, join, and sort for CSV, TSV and JSON

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • miller

    Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

    It's nice to see a comparison between the ongoing C version [1] and the upcoming Go version [2]. The README says a bit more about the performance comparison and the C version's strength, but still.

    [1] https://github.com/johnkerl/miller/tree/main/c

    [2] https://github.com/johnkerl/miller/tree/main/go

  • rq

    Record Query - A tool for doing record analysis and transformation (by dflemstr)

    There's also rq (Record Query) [1], which supports CSV and JSON but not TSV. It's written in Rust.

    [1] https://github.com/dflemstr/rq

  • jq

    Discontinued Command-line JSON processor [Moved to: https://github.com/jqlang/jq] (by stedolan)

    I recently came across jq, which I use for JSON parsing: https://stedolan.github.io/jq/

  • nushell

    A new type of shell

    Sounds like it. Another option might be nushell. https://www.nushell.sh/

  • vnlog

    Process labelled tabular ASCII data using normal UNIX tools

    Similar, but using the ACTUAL awk, sed, join, sort tools you already have and know about: https://github.com/dkogan/vnlog/

  • DataProfiler

    What's in your data? Extract schema, statistics and entities from datasets

    Not exactly the same, but we wrote a library that easily loads any delimited file type and finds the header (even if it is not the first row). It also loads JSON, Parquet, and Avro files into a dataframe. It's not exactly a CLI, but it's pretty easy to use:

    https://github.com/capitalone/dataprofiler

    Anyway, Miller CLI looks pretty interesting.

  • RecordStream

    Command-line tools for slicing and dicing JSON records.

    I don't know about Miller CLI's portability, but RecordStream (https://github.com/benbernard/RecordStream) is my go-to Swiss Army knife.
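
To illustrate what "name-indexed" means in the Miller description above, here is a minimal Python sketch that sorts CSV records by a named field and keeps only named columns, roughly what sort and cut do positionally. It is not Miller's implementation or command syntax; the sample data and the `sort_and_cut` helper are hypothetical and exist only for this illustration.

```python
import csv
import io

# Hypothetical sample data, made up for illustration only.
SAMPLE_CSV = """name,color,quantity
pear,green,7
apple,red,12
fig,purple,3
"""


def sort_and_cut(csv_text, sort_field, keep_fields):
    """Sort records by a named field, then keep only the named columns."""
    # DictReader uses the header row, so every record is keyed by column name.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Lexical (string) sort on the chosen field.
    rows.sort(key=lambda row: row[sort_field])
    return [{field: row[field] for field in keep_fields} for row in rows]


result = sort_and_cut(SAMPLE_CSV, "name", ["name", "quantity"])
for record in result:
    print(record)
```

Because fields are addressed by header name rather than by position, the same call keeps working if the columns in the input file are reordered.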
