miller
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
The 10x number predates the improvements in https://github.com/johnkerl/miller/pull/786 et al. The earlier negative perf results were my fault, not Go's -- I focused initially on the port and feature development, leaving benchmarking and optimization until the end. That said, Go is a bit slower than C line for line; however, Miller 5 (in C) was single-threaded, while Miller 6 (in Go) actively uses multicore. This is why complex processing chains now run much faster in Go than in C: multicore and pipelining are much easier to do in Go.
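To illustrate the pipelining point, here is a minimal sketch of a two-stage record pipeline in Go, where each stage runs in its own goroutine and hands records to the next over a channel. This is not Miller's actual implementation: the `record` type, the `parse`, `upcase`, and `runPipeline` functions, and the `k=v,k=v` input format are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// record is a hypothetical name-indexed record (field name -> value).
type record map[string]string

// parse is stage 1 (illustrative only): it turns "k=v,k=v" lines into
// records and sends them downstream from its own goroutine.
func parse(lines []string) <-chan record {
	out := make(chan record, 16)
	go func() {
		defer close(out)
		for _, line := range lines {
			rec := record{}
			for _, pair := range strings.Split(line, ",") {
				if k, v, ok := strings.Cut(pair, "="); ok {
					rec[k] = v
				}
			}
			out <- rec
		}
	}()
	return out
}

// upcase is stage 2 (illustrative only): it upper-cases one field of each
// record while stage 1 may already be parsing the next line.
func upcase(in <-chan record, field string) <-chan record {
	out := make(chan record, 16)
	go func() {
		defer close(out)
		for rec := range in {
			if v, ok := rec[field]; ok {
				rec[field] = strings.ToUpper(v)
			}
			out <- rec
		}
	}()
	return out
}

// runPipeline chains the stages and collects the results in input order.
func runPipeline(lines []string, field string) []record {
	var results []record
	for rec := range upcase(parse(lines), field) {
		results = append(results, rec)
	}
	return results
}

func main() {
	lines := []string{"name=alice,city=paris", "name=bob,city=rome"}
	for _, rec := range runPipeline(lines, "name") {
		fmt.Println(rec["name"], rec["city"])
	}
}
```

Because each stage is a separate goroutine, the Go runtime can schedule the stages on different cores, so a longer chain of stages overlaps its work instead of running strictly one step after another.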
-
It's interesting watching these types of tools get re-invented periodically:
https://github.com/benbernard/RecordStream
It shows that the Unix model of many small, composable tools is very powerful, but also that POSIX is missing some essential pieces that everyone keeps trying to add or reinvent.
-
My team built a similar tool in Python to load any delimited file, JSON, Parquet, or Avro with one command:
https://github.com/capitalone/DataProfiler
It effectively loads anything into a dataframe.
-
I just published dsq [0] for running SQL queries against CSV, JSON, Excel, Parquet, etc., or just converting those files to JSON.
[0] https://github.com/multiprocessio/datastation/tree/main/runn...