-
Turbo-Range-Coder
TurboRC - Fastest Range Coder + Arithmetic Coding / Fastest Asymmetric Numeral Systems
As a general-purpose compressor, Iguana decompresses a lot slower than advertised when tested on a typical data compression corpus.
- Benchmark from encode.su experts https://encode.su/threads/4041-Iguana-a-fast-vectorized-comp...
- Benchmark from the Iguana developer here: https://github.com/SnellerInc/sneller/tree/master/cmd/iguana...
Silesia corpus, CPU: Xeon Gold 5320
Compressor     Ratio    Decompression speed
zstd -b3       3.186     943.9 MB/s
zstd -b9       3.574    1015.8 MB/s
zstd -b18      3.967     910.6 MB/s
lz4 -b1        2.101    3493.8 MB/s
lz4 -b5        2.687    3323.5 MB/s
lz4 -b9        2.721    3381.5 MB/s
iguana -t=0    2.58     4450 MB/s
iguana -t=1    3.11     2260 MB/s
As you can see, iguana with entropy coding enabled (-t 1) has a similar compression ratio to zstd -3, but it decompresses more than twice as quickly. With entropy coding disabled (-t 0), iguana has a compression ratio roughly equivalent to lz4 -5 and decompresses about 33% faster.
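The "more than twice as quickly" and "about 33% faster" claims can be checked directly against the figures quoted above. The following is only a minimal sketch: the numbers are copied from the listing as given (one machine, one corpus), and the helper names `speedup` and `ratio_gap` are made up for illustration.

```python
# Figures copied from the listing above: (compression ratio, decompression MB/s).
# They describe a single run on one CPU and are not re-measured here.
results = {
    "zstd -b3": (3.186, 943.9),
    "lz4 -b5": (2.687, 3323.5),
    "iguana -t=0": (2.58, 4450.0),
    "iguana -t=1": (3.11, 2260.0),
}


def speedup(candidate, baseline):
    """How many times faster `candidate` decompresses than `baseline`."""
    return results[candidate][1] / results[baseline][1]


def ratio_gap(candidate, baseline):
    """Relative difference in compression ratio (negative means worse)."""
    return results[candidate][0] / results[baseline][0] - 1.0


for cand, base in [("iguana -t=1", "zstd -b3"), ("iguana -t=0", "lz4 -b5")]:
    print(
        f"{cand} vs {base}: "
        f"{speedup(cand, base):.2f}x decompression speed, "
        f"{ratio_gap(cand, base):+.1%} compression ratio"
    )
```

On these figures, both Iguana modes give up a few percent of compression ratio relative to the codec they are compared with, in exchange for the faster decompression.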
- The code was not perfect, and the fuzzer has found issues.
The LZSSE library was abandoned five years ago, but they have great blog posts to read: https://github.com/ConorStokes/LZSSE
Iguana looks promising, but the AVX-512 requirement is too restrictive. We need something that works on both x86 and ARM. Also, integrating Go assembly into other software is not easy, and the AGPL license makes it incompatible.
This is amazing news!
PS. I would be interested to hear if anyone has experience integrating the Go assembly dialect into a C or C++ build system. We needed it for another experiment: https://github.com/ClickHouse/ClickHouse/issues/45130#issuec...