Our great sponsors
-
FYI I ran on the worldcities data at https://github.com/petewarden/dstkdata (credit to xsv for choosing that dataset) against https://github.com/BurntSushi/xsv and https://github.com/liquidaty/zsv (full disclosure: I am one of the zsv authors). Here's what I got.
fastest to slowest: zsv (0.07), xsv (0.16), goawk (0.42), python (~1.6)
Obviously, does not tell the whole story as this test was limited to "count" and an interpreted language is expected to always be slower compared to a precompiled command, but, it might be relevant to a user deciding what tool to use. Also, might be instructive as to room for improvement in the go code (or possibly the go code could use the c lib)-- I note that even if the goawk command is '{}' the runtime is still about the same.
full results:
goawk:
-
FYI I ran on the worldcities data at https://github.com/petewarden/dstkdata (credit to xsv for choosing that dataset) against https://github.com/BurntSushi/xsv and https://github.com/liquidaty/zsv (full disclosure: I am one of the zsv authors). Here's what I got.
fastest to slowest: zsv (0.07), xsv (0.16), goawk (0.42), python (~1.6)
Obviously, does not tell the whole story as this test was limited to "count" and an interpreted language is expected to always be slower compared to a precompiled command, but, it might be relevant to a user deciding what tool to use. Also, might be instructive as to room for improvement in the go code (or possibly the go code could use the c lib)-- I note that even if the goawk command is '{}' the runtime is still about the same.
full results:
goawk:
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
zsv
zsv+lib: world's fastest (simd) CSV parser, bare metal or wasm, with an extensible CLI for SQL querying, format conversion and more
FYI I ran on the worldcities data at https://github.com/petewarden/dstkdata (credit to xsv for choosing that dataset) against https://github.com/BurntSushi/xsv and https://github.com/liquidaty/zsv (full disclosure: I am one of the zsv authors). Here's what I got.
fastest to slowest: zsv (0.07), xsv (0.16), goawk (0.42), python (~1.6)
Obviously, does not tell the whole story as this test was limited to "count" and an interpreted language is expected to always be slower compared to a precompiled command, but, it might be relevant to a user deciding what tool to use. Also, might be instructive as to room for improvement in the go code (or possibly the go code could use the c lib)-- I note that even if the goawk command is '{}' the runtime is still about the same.
full results:
goawk:
-
I've now updated the benchmarks to avoid huge.csv: the write benchmarks now write a big 1GB, 20-column CSV and the read benchmarks use that same CSV: https://github.com/benhoyt/goawk/commit/07eb4505a7f64cffceeb...