-
tsv-utils
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
For anything down and dirty, what's wrong with -F'"'? For anything fancy there are plenty of things like the below.
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
includes csv to tsv: https://github.com/eBay/tsv-utils
HT: https://simonwillison.net/
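A minimal Python sketch of the two approaches contrasted in this comment: a quick split on double quotes (a rough analogue of `-F'"'`) versus a real CSV parse followed by TSV output. The function names `quick_split` and `csv_to_tsv` are hypothetical, and replacing embedded tabs and newlines with spaces is an assumption made for the sketch, not a description of how tsv-utils itself behaves.

```python
import csv
import io


def quick_split(line):
    """Quick and dirty: split on double quotes and keep the odd-numbered pieces.

    Only works when every field is quoted and no field contains an escaped quote.
    """
    return line.split('"')[1::2]


def csv_to_tsv(csv_text):
    """Parse CSV properly, then emit TSV.

    Assumption for this sketch: tabs and newlines inside a field become spaces,
    so that every record stays on one line with unambiguous tab delimiters.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    lines = []
    for row in reader:
        cleaned = [
            field.replace("\r\n", " ").replace("\n", " ").replace("\r", " ").replace("\t", " ")
            for field in row
        ]
        lines.append("\t".join(cleaned))
    return "".join(line + "\n" for line in lines)


simple_line = '"Smith, Jane","said hi"'
sample_csv = 'name,quote\n"Smith, Jane","said ""hi"""\n'

quick_fields = quick_split(simple_line)
tsv_text = csv_to_tsv(sample_csv)
```

The quick split is fine for the simple line, but the escaped quotes in `sample_csv` would confuse it, which is where a real parser earns its keep.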
-
usv
Unicode Separated Values (USV) data markup for units, records, groups, files, streaming, and more.
Ben, this is great, thank you. Would you be open to adding Unicode Separated Values (USV) as well? It's much like CSV, and also simpler because there is no escaping and no quoting. I can donate $50 to you or your charity of choice as a token of thanks and encouragement.
https://github.com/sixarm/usv
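To illustrate why a separator-based format needs no quoting, here is a rough Python sketch. It assumes units are delimited by U+241F (␟) and records by U+241E (␞); the helper names `parse_usv` and `format_usv` are hypothetical, and groups, files, and the rest of the USV rules are not handled, so consult the linked repository for the actual format.

```python
UNIT_SEP = "\u241f"    # symbol for unit separator (assumed field delimiter)
RECORD_SEP = "\u241e"  # symbol for record separator (assumed record delimiter)


def parse_usv(text):
    """Split text into records, then each record into units. No quoting rules needed."""
    chunks = text.split(RECORD_SEP)
    if chunks and chunks[-1].strip() == "":
        chunks.pop()  # a trailing record separator leaves an empty final chunk
    return [chunk.strip("\r\n").split(UNIT_SEP) for chunk in chunks]


def format_usv(records):
    """Join units and records with the assumed separators, ending with a record separator."""
    return "".join(UNIT_SEP.join(units) + RECORD_SEP for units in records)


# Commas and double quotes are ordinary data here.
data = 'name\u241fnote\u241eJane\u241fsaid, "hi"\u241e'
records = parse_usv(data)
round_trip = format_usv(records)
```

Because the separators are characters that rarely occur in real data, commas, quotes, and tabs pass through untouched.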
-
miller
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
I have grown fond of using miller[0] for command-line data processing. It handles the standard tabular formats (csv, tsv, json) and has all of the standard data cleanup options. It works on streams, so most operations are not limited by memory.
[0]: https://github.com/johnkerl/miller
-
xsv does similar stuff for CSV, and very rapidly: https://github.com/BurntSushi/xsv
https://miller.readthedocs.io/en/latest/why/ has a nice section on "why miller":
> First: there are tools like xsv which handles CSV marvelously and jq which handles JSON marvelously, and so on -- but I over the years of my career in the software industry I've found myself, and others, doing a lot of ad-hoc things which really were fundamentally the same except for format. So the number one thing about Miller is doing common things while supporting multiple formats: (a) ingest a list of records where a record is a list of key-value pairs (however represented in the input files); (b) transform that stream of records; (c) emit the transformed stream -- either in the same format as input, or in a different format.
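The ingest / transform / emit pattern described in that quote can be sketched in Python as below. This is only an illustration of the idea, not how Miller is implemented; the function names and the `qty`, `price`, and `total` fields are hypothetical.

```python
import csv
import io
import json


def read_records(text, fmt):
    """(a) Ingest: yield each record as a dict of key-value pairs, whatever the input format."""
    if fmt == "csv":
        yield from csv.DictReader(io.StringIO(text, newline=""))
    elif fmt == "tsv":
        yield from csv.DictReader(io.StringIO(text, newline=""), delimiter="\t")
    elif fmt == "json":
        yield from json.loads(text)
    else:
        raise ValueError(f"unsupported format: {fmt}")


def transform(records):
    """(b) Transform: add a derived field to each record (hypothetical example)."""
    for rec in records:
        rec = dict(rec)
        rec["total"] = str(int(rec["qty"]) * int(rec["price"]))
        yield rec


def write_records(records, fmt):
    """(c) Emit: serialize in the requested format, independent of the input format.

    For brevity this collects the records into a list, so unlike a real
    streaming tool it holds the whole output in memory.
    """
    records = list(records)
    if fmt == "json":
        return json.dumps(records, indent=2)
    if fmt not in ("csv", "tsv"):
        raise ValueError(f"unsupported format: {fmt}")
    out = io.StringIO()
    if records:
        writer = csv.DictWriter(
            out,
            fieldnames=list(records[0].keys()),
            delimiter="," if fmt == "csv" else "\t",
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(records)
    return out.getvalue()


csv_in = "item,qty,price\napple,3,2\npear,1,5\n"
tsv_out = write_records(transform(read_records(csv_in, "csv")), "tsv")
json_out = write_records(transform(read_records(csv_in, "csv")), "json")
```

The transform step never looks at the file format, which is the point of the quote: the same record-level logic serves CSV, TSV, and JSON.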
-
I was using xsv a lot at work (it is so much faster than csvkit) but I've recently jumped to qsv, a fork with more features.
https://github.com/jqnatividad/qsv
-
This is not an outlier. `mlr` is quite slow, literally off-the-charts slow for our purposes, when we benchmarked it against xsv and zsv (see https://github.com/liquidaty/zsv; disclaimer: I'm one of its authors).
-
I've also added explicit tests for ASCII and Unicode unit and record separators, just to ensure I don't regress: https://github.com/benhoyt/goawk/commit/215652de58f33630edb0...