Select, put and delete data from JSON, TOML, YAML, XML and CSV files

miller

63 8,542 9.1 Go

Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

I'm a big fan of miller (mlr) -- it's the tool I landed on when I needed to "graduate" from awk to look at CSV data. But when I read "go based" in your comment, I thought "nope, it's written in C". But no! It was ported to Go -- very interesting!
The developer wrote a comprehensive document explaining the rationale behind the porting that answered all my questions and a lot more: https://github.com/johnkerl/miller/blob/main/README-go-port.....
Thought other miller/mlr fans (that don't follow its development) might find this interesting as well.
(The dasel tool looks very cool, too -- looks like a good complement to mlr and similar tools!)

dasel

44 4,856 8.2 Go

Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
yj

6 914 3.6 Go

CLI - Convert between YAML, TOML, JSON, and HCL. Preserves map order.
jq

306 25,063 0.0 C

Discontinued Command-line JSON processor [Moved to: https://github.com/jqlang/jq] (by stedolan)
brackit

21 46 6.9 Java

Query processor with proven optimizations, ready to use for your JSON store to query semi-structured data with JSONiq. Can also be used as an ad-hoc in-memory query processor.
sirix

44 1,076 9.2 Java

SirixDB is an embeddable, bitemporal, append-only database system and event store, storing immutable lightweight snapshots. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach.

Regarding XQuery, we just added JSON querying on top in Brackit[1] / SirixDB[2].
Brackit is a retargetable query compiler that performs many optimizations at compile time, for instance optimizing joins and aggregations. It is usable as an in-memory processor or as the query processor of a database system.
The Ph.D. thesis of Sebastian:
Separating Key Concerns in Query Processing - Set Orientation, Physical Data Independence, and Parallelism
http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/publicatio...
[1] http://brackit.io
[2] https://sirix.io

flatterer

14 164 6.6 Rust

Opinionated JSON to CSV/XLSX/SQLITE/PARQUET converter. Flattens JSON fast.

Try this: https://flatterer.opendata.coop/
There is no binary yet, but there is a Python CLI and library, even though it is written in Rust.
It is the only tool I know of that handles nested JSON and converts it into relational tables.
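
To make "nested JSON into relational tables" concrete, here is a minimal Python sketch of the general idea. It is not flatterer's API or algorithm; the function name `flatten_to_tables`, the `_id` column, and the `<table>_id` link column are all hypothetical choices made for illustration. Nested objects become prefixed columns, and lists of objects become child tables that point back to their parent row.

```python
from typing import Any


def flatten_to_tables(records: list[dict[str, Any]], root: str = "main") -> dict[str, list[dict[str, Any]]]:
    """Split nested records into flat tables (hypothetical helper, not flatterer's API)."""
    tables: dict[str, list[dict[str, Any]]] = {}

    def add_row(table: str, obj: dict[str, Any], link: dict[str, Any]) -> None:
        rows = tables.setdefault(table, [])
        # "_id" is a per-table row number; `link` carries the parent reference, if any.
        row = {"_id": len(rows), **link}
        rows.append(row)
        fill(table, row, obj, prefix="")

    def fill(table: str, row: dict[str, Any], obj: dict[str, Any], prefix: str) -> None:
        for key, value in obj.items():
            column = prefix + key
            if isinstance(value, dict):
                # Nested object: keep it in the same row under prefixed column names.
                fill(table, row, value, prefix=column + "_")
            elif isinstance(value, list) and all(isinstance(v, dict) for v in value):
                # List of objects: one row per item in a child table, linked to this row.
                child = f"{table}_{column}"
                for item in value:
                    add_row(child, item, {f"{table}_id": row["_id"]})
            else:
                # Scalars (and lists of scalars) are stored as-is.
                row[column] = value

    for record in records:
        add_row(root, record, {})
    return tables
```

Calling `flatten_to_tables` on a record with an `items` list yields a `main` table and a `main_items` table whose rows carry a `main_id` column. A real converter also has to deal with column-name collisions and type inference, which this sketch ignores.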

json2csv

1 799 2.6 Go

command line tool to convert json to csv (by jehiah)

In addition to the already mentioned jq, there's https://github.com/jehiah/json2csv
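
For readers who want to see what such a conversion involves, the following is a rough standard-library Python sketch that turns newline-delimited JSON into CSV for a chosen set of fields. It does not reproduce json2csv's options or behavior; the function name `json_lines_to_csv` and the field names in the usage note are made up for illustration.

```python
import csv
import io
import json
from typing import Iterable


def json_lines_to_csv(lines: Iterable[str], fields: list[str]) -> str:
    """Convert JSON Lines to CSV text, one column per requested field (illustrative only)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)  # header row
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        # Missing fields become empty cells rather than raising an error.
        writer.writerow([record.get(field, "") for field in fields])
    return out.getvalue()
```

For example, `json_lines_to_csv(['{"user": "ann", "n": 1}'], ["user", "n"])` produces a header row followed by one data row. Nested values are not expanded here; they would need a flattening step first.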

jellex

11 93 3.8 Python

TUI to filter JSON and JSON Lines data with Python syntax

You could do something like this in pure Python without the JSON loading boilerplate with jello[0]. An interactive TUI for jello called jellex[1] is also available. (I am the author)
[0] https://github.com/kellyjonbrazil/jello
[1] https://github.com/kellyjonbrazil/jellex
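
As a point of comparison, this is roughly the kind of JSON loading boilerplate that plain Python requires before any filtering can happen. It is a standard-library sketch only and does not show jello's own syntax; the helper name `select_field` and the `name` and `age` fields are hypothetical.

```python
import json
from typing import Any, Callable, Iterable


def select_field(lines: Iterable[str], predicate: Callable[[dict[str, Any]], bool], field: str) -> list[Any]:
    """Parse JSON Lines and return `field` from every record that matches `predicate`."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)  # the parsing step a dedicated tool handles for you
        if predicate(record):
            results.append(record[field])
    return results


# Hypothetical sample data standing in for a JSON Lines file or stdin.
sample = [
    '{"name": "Ann", "age": 41}',
    '{"name": "Bob", "age": 27}',
    '',
    '{"name": "Cy", "age": 35}',
]
over_30 = select_field(sample, lambda r: r.get("age", 0) > 30, "name")
```

Only the lambda expresses the actual query; everything else is reading and parsing, which is the part a tool of this kind is meant to remove.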

flatten-tool

2 101 5.9 Python

Tools for generating CSV and other flat versions of the structured data

* https://flatten-tool.readthedocs.io/en/latest/
It's maintained by Open Data Services Coop, where we use it as a component in several of our web & data pipeline tools for working with data that is published in a Data Standard.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
