-
solaris-userland
Open Source software in Solaris using gmake based build system to drive building various software components.
-
DirectStorage
DirectStorage for Windows is an API that allows game developers to unlock the full potential of high speed NVMe drives for loading game assets.
-
Moby
The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
-
zip.js
JavaScript library to zip and unzip files supporting multi-core compression, compression streams, zip64, split files and encryption.
-
nvcomp
Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.
You can grab the version from the solaris userland repo I linked and use it without me completing a homework assignment. Just grab the pigz-2.3.4 source then apply the patches from [1] in the proper order. Maybe some of them aren't needed for non-Solaris.
1. https://github.com/oracle/solaris-userland/tree/master/compo...
I thought I had opened a PR for that a long while ago, but it doesn't show up on GitHub these days. In any case, I did ask Mark Adler to review it. It was never a priority, and then the code changed in ways that I don't really want to deal with.
While looking through the PRs, I noticed a PR for Blocked GZip Format (BGZF) [2]. That's very interesting, and perhaps suggests that bgzip is a tool you would be interested in.
2. https://github.com/madler/pigz/pull/19
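The idea behind a blocked gzip layout can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not the BGZF format itself (which adds its own metadata to each block) and not code from pigz or bgzip. The names `compress_blocked`, `read_block` and `BLOCK_SIZE` are hypothetical. Each block is compressed as an independent gzip member, so blocks can be compressed in parallel, and a small offset index lets a reader decompress one block without touching the rest.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # hypothetical block size, chosen only for illustration


def compress_blocked(data: bytes, block_size: int = BLOCK_SIZE):
    """Compress fixed-size blocks independently and return (stream, index).

    index[i] is the byte offset of gzip member i inside the stream.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # CPython's zlib bindings typically release the GIL while compressing,
    # so a thread pool can keep several cores busy.
    with ThreadPoolExecutor() as pool:
        members = list(pool.map(gzip.compress, blocks))
    index, offset = [], 0
    for member in members:
        index.append(offset)
        offset += len(member)
    return b"".join(members), index


def read_block(stream: bytes, index: list, i: int) -> bytes:
    """Decompress only block i, using the index to locate its gzip member."""
    start = index[i]
    end = index[i + 1] if i + 1 < len(index) else len(stream)
    return gzip.decompress(stream[start:end])


data = bytes(range(256)) * 1024  # 256 KiB of sample input
stream, index = compress_blocked(data)
```

Because the result is a concatenation of ordinary gzip members, a standard gzip decoder can still read it end to end. The cost is a slightly worse compression ratio, since no block can reference data from the previous one.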
If you are interested in optimizing parallel decompression and you happen to have a suitable NVIDIA GPU, GDeflate [1] is interesting. The target market for this is PC games using DirectStorage to quickly load game assets. The graph in [1] shows DirectStorage maxing out the throughput of a PCIe Gen 3 drive at about 3 GiB/s when compression is not used.
If you have suitable hardware running Windows, you can try this out for yourself using Microsoft's DirectStorage GPU decompression benchmark [2].
A reference implementation of a single threaded compressor and multi (CPU) threaded decompressor can be found at [3]. It is Apache-2 licensed.
1. https://developer.nvidia.com/blog/accelerating-load-times-fo...
2. https://github.com/microsoft/DirectStorage/tree/main/Samples...
3. https://github.com/microsoft/DirectStorage/blob/main/GDeflat...
Disclaimer: I work for NVIDIA, have nothing to do with this, and am not speaking for NVIDIA.
I have not only implemented parallel decompression but also random access to offsets in the stream with https://github.com/mxmlnkn/pragzip. I did some benchmarks on some really beefy machines with 128 cores and was able to reach almost 20 GB/s decompression bandwidth. The single-core decoder still has lots of potential for optimization, though, because I had to write it from scratch.
Useful with Docker, see https://github.com/moby/moby/pull/35697
I’ve integrated pigz into different build and CI pipelines a few times. Don’t expect wonders since some steps still need to run serially, but a few seconds here and there might still add up to a few minutes on a large build.
Similarly, if people are interested, I have added support for compressing zip files on several cores in zip.js [1]. The approach is simpler, as it consists of compressing the entries in parallel. It still offers a significant performance gain when compressing multiple files into a zip file, which is often the typical case.
[1] https://github.com/gildas-lormeau/zip.js
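The per-entry approach described above can be sketched as follows. This is a minimal Python illustration, not zip.js's actual code: `deflate_entry` and `compress_entries` are hypothetical names, and the sketch only produces the raw DEFLATE payload for each entry. Writing the zip headers and central directory around those payloads is left out.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def deflate_entry(item):
    """Compress one entry as a raw DEFLATE stream, the method zip normally uses."""
    name, payload = item
    # wbits=-15 asks zlib for raw DEFLATE without a zlib or gzip wrapper.
    compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
    return name, compressor.compress(payload) + compressor.flush()


def compress_entries(entries: dict) -> dict:
    """Compress every entry independently, one task per entry."""
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(deflate_entry, entries.items()))


entries = {
    "a.txt": b"alpha " * 10_000,
    "b.txt": b"bravo " * 20_000,
    "c.bin": bytes(range(256)) * 500,
}
compressed = compress_entries(entries)
```

Since entries never share compression state, no coordination between workers is needed. The flip side is that a single large entry gains nothing from this scheme.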
Containerd will utilize unpigz if it’s on your PATH, thank me later: https://github.com/containerd/containerd/blob/main/archive/c...
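The "use it if it's on your PATH" pattern looks roughly like this in Python. This is a sketch of the general technique, not containerd's Go implementation, and `gunzip` is a hypothetical helper name. It assumes that `unpigz -d -c` reads compressed data from stdin and writes the decompressed result to stdout.

```python
import gzip
import shutil
import subprocess


def gunzip(data: bytes) -> bytes:
    """Decompress gzip data, preferring an external unpigz when one is installed."""
    unpigz = shutil.which("unpigz")
    if unpigz is None:
        # No unpigz on PATH: fall back to the in-process decoder.
        return gzip.decompress(data)
    result = subprocess.run(
        [unpigz, "-d", "-c"],
        input=data,
        stdout=subprocess.PIPE,
        check=True,  # raise if unpigz exits with an error
    )
    return result.stdout


payload = b"layer contents " * 1000
roundtrip = gunzip(gzip.compress(payload))
```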
Interesting. It looks like https://github.com/zrajna/zindex became public about a year after my searches for parallel uncompression came up empty and I started hacking on pigz.
Build or download TurboBench [1] executables for Linux and Windows from the releases page [2] and run your own tests comparing Oodle, zstd and other compressors.
[1] https://github.com/powturbo/TurboBench
[2] https://github.com/powturbo/TurboBench/releases
That's really confusing, since `pixz` exists and its "pixie" pronunciation actually works:
https://github.com/vasi/pixz