miller VS tsv-utils

Compare miller and tsv-utils and see how they differ.

              miller                                     tsv-utils
Mentions      63                                         9
Stars         8,510                                      1,396
Growth        -                                          0.6%
Activity      9.1                                        0.0
Last commit   8 days ago                                 over 1 year ago
Language      Go                                         D
License       GNU General Public License v3.0 or later   Boost Software License 1.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

miller

Posts with mentions or reviews of miller. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-22.

tsv-utils

Posts with mentions or reviews of tsv-utils. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-11-02.
  • Tracking SQLite Database Changes in Git
    7 projects | news.ycombinator.com | 2 Nov 2023
    You might want to look at tsv-utils, or a similar project: https://github.com/eBay/tsv-utils

    For the SQL part, but maybe a lot heavier, you can use one of the projects listed on this page: https://github.com/multiprocessio/dsq (No longer maintained, but has links to lots of other projects)

  • Splitting CSV files at 3GB/s
    3 projects | news.ycombinator.com | 20 Jun 2022
  • Modernizing AWK, a 45-year old language, by adding CSV support
    11 projects | news.ycombinator.com | 12 May 2022
    For anything down and dirty, what's wrong with -F'"'? For anything fancy there are plenty of things like the below.

    eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.

    includes csv to tsv: https://github.com/eBay/tsv-utils

    HT: https://simonwillison.net/

  • Modernizing AWK, a 45-year old language, by adding CSV support
    11 projects | news.ycombinator.com | 12 May 2022
    When you have "format wars", the best idea is usually to have a converter program change to the easiest-to-work-with format, unless this incurs a massive expansion in space, as with some image/video formats.

    With CSV-like data, bulk conversion from quoted-escaped RFC4180 CSV to a simpler-to-parse format is the best plan for several reasons. First, it may "catch on", help Microsoft/R/whoever embrace the format and in doing so squash many bugs written by "data analyst/scientist coders". Second, in a shell, a|b runs programs a and b in parallel on multiple cores and allows things like csv2x|head -n10000|b. Third, bulk conversion to a random access file where literal delimiters cannot occur as non-delimiters makes file segmentation trivial, so processing can be nCores times faster (under often-satisfied assumptions). There are some D tools for this in https://github.com/eBay/tsv-utils and a much smaller stand-alone Nim tool https://github.com/c-blake/nio/blob/main/utils/c2tsv.nim . Optional quoting was always going to be a PITA due to its non-locality. What if there is no quote anywhere? Fourth, by using a program as the unit of modularity in this case, you make things programming language agnostic. Someone could even go to town and write a pure SIMD/AVX512 converter in assembly and solve the problem "once and for all" on a given CPU. The problem is actually just simple enough that this smells possible.

    I am unaware of any "document" that "standardizes" this escaped/lossless TSV format. { Maybe call it "DSV" for delimiter separated values where "delimiters actually separate"? } Someone want to write an RFC or point to one? It can be just as "general/lossless" (see https://news.ycombinator.com/item?id=31352170).

    Of course, if you are going to do a lot of data processing against some data, it is even better to parse all the way down to binary so that you never have to parse again (well, unless you call CPUs loading registers "parsing"), which is what database systems have been doing since the 1960s.

  • [OC]Tidy Viewer (tv) is a cross-platform csv pretty printer that uses column styling to maximize viewer enjoyment.
    5 projects | /r/commandline | 11 Aug 2021
    tsv-utils - Command line csv data manipulation toolkit. D
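
The bulk conversion described in the "Modernizing AWK" comment above (quoted RFC 4180 CSV to a TSV in which literal delimiters never occur inside fields) can be sketched in Python. This is a minimal illustration, not the method used by tsv-utils or c2tsv.nim: the backslash escape scheme and the function names (escape_field, unescape_field, csv_to_tsv, parse_tsv) are assumptions made for the example, since the comment itself notes that no document standardizes such a format.

```python
import csv
import io

# Assumed escape scheme (not a standard): backslash, tab, newline and
# carriage return inside a field are written as \\, \t, \n and \r.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
_UNESCAPES = {"\\": "\\", "t": "\t", "n": "\n", "r": "\r"}


def escape_field(field):
    """Replace characters that would collide with the TSV delimiters."""
    return "".join(_ESCAPES.get(ch, ch) for ch in field)


def unescape_field(field):
    """Invert escape_field so that the conversion is lossless."""
    out = []
    chars = iter(field)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # Unknown escapes are passed through unchanged.
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


def csv_to_tsv(csv_text):
    """Convert quoted CSV text to escaped TSV text, one record per line."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    lines = ["\t".join(escape_field(f) for f in row) for row in reader]
    return "".join(line + "\n" for line in lines)


def parse_tsv(tsv_text):
    """Parse escaped TSV with plain splits; no quote tracking is needed."""
    return [
        [unescape_field(f) for f in line.split("\t")]
        for line in tsv_text.split("\n")[:-1]
    ]


# A quoted field holding a comma, a tab, a newline and doubled quotes.
sample = 'id,note\r\n1,"a, b"\r\n2,"tab\there\nnext ""line"""\r\n'
tsv = csv_to_tsv(sample)
# Round trip: the simple parser recovers what the CSV parser saw.
assert parse_tsv(tsv) == list(csv.reader(io.StringIO(sample, newline="")))
```

After conversion, every newline ends a record and every tab separates two fields, so a downstream tool can split a large file at any newline boundary without scanning for open quotes. That property is what makes the per-core file segmentation mentioned in the comment straightforward.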

What are some alternatives?

When comparing miller and tsv-utils you can also consider the following projects:

visidata - A terminal spreadsheet multitool for discovering and arranging data

xsv - A fast CSV command line toolkit written in Rust.

jq - Command-line JSON processor [Moved to: https://github.com/jqlang/jq]

dasel - Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.

csvtk - A cross-platform, efficient and practical CSV/TSV toolkit in Golang

yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor

csvq - SQL-like query language for csv

json-toolkit - "the best opensource converter I've found across the Internet" -- dene14

gron - Make JSON greppable!

dextool - Suite of C/C++ tooling built on LLVM/Clang

awesome-cli-apps - 🖥 📊 🕹 🛠 A curated list of command line apps

csvkit - A suite of utilities for converting to and working with CSV, the king of tabular file formats.