TurboBench VS rapidgzip

Compare TurboBench vs rapidgzip and see what their differences are.

             TurboBench      rapidgzip
Mentions     10              14
Stars        312             320
Growth       -               -
Activity     8.9             9.5
Last commit  9 months ago    8 days ago
Language     C               C++
License      -               Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

TurboBench

Posts with mentions or reviews of TurboBench. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-08-26.
  • Zstd Content-Encoding planned to ship with Chrome 123
    1 project | news.ycombinator.com | 7 Feb 2024
    I'm still unconvinced about this addition. And I don't even dislike Zstandard.

    The main motivation seems to be that while Zstandard is worse than Brotli at the highest level, it's substantially faster than Brotli when data has to be compressed on the fly with a limited computation budget. That might be true, but I have yet to see any concrete or even anecdotal evidence, even in the issue tracker [1], while there exist some benchmarks where both Zstandard and Brotli are fast enough for web usage even at lower levels [2].

    According to their FAQ [3], Meta and Akamai have successfully used Zstandard in their internal networks, but my gut feeling is that they never actually tried to optimize Brotli instead. In fact, Meta employs the main author of Zstandard, so it would have been easier to tune Zstandard instead of Brotli. While Brotli has some fundamental differences from Zstandard (in particular, Brotli doesn't use arithmetic-equivalent coding), no one has concretely demonstrated, in my opinion, that those differences would prevent Brotli from being fast enough for dynamic content.

    [1] https://issues.chromium.org/issues/40196713

    [2] https://github.com/powturbo/TurboBench/issues/43

    [3] https://docs.google.com/document/d/14dbzMpsYPfkefAJos124uPrl...

  • TurboBench: Dynamic/Static web content compression benchmark
    1 project | news.ycombinator.com | 28 Aug 2023
    1 project | /r/compression | 11 Jul 2023
    1 project | news.ycombinator.com | 11 Jul 2023
  • Ebiggers/libdeflate: Heavily optimized DEFLATE/zlib/gzip library
    5 projects | news.ycombinator.com | 26 Aug 2023
    libdeflate compresses better and has faster decompression than igzip.

    See the Silesia single-core in-memory benchmark here [1] comparing zlib, libdeflate, igzip, ...

    [1] https://github.com/powturbo/TurboBench/issues/4

  • Intel QuickAssist Technology Zstandard Plugin for Zstandard
    10 projects | news.ycombinator.com | 16 Aug 2023
    - https://github.com/powturbo/TurboBench/issues/43

    [1] https://github.com/powturbo/TurboBench

  • Variation on RLE to Achieve Lossless Compression for Tabular Data
    1 project | news.ycombinator.com | 12 Jun 2023
    Compressing your sample file, we get 823 bytes with brotli

    Download TurboBench and make your own tests:

    [1] - https://github.com/powturbo/TurboBench

  • Data Compression Drives the Internet. Here’s How It Works
    1 project | news.ycombinator.com | 10 Jun 2023
    - igzip 1,2 is best for very fast networks > 10MB/s

    brotli brings little value at decompression for users

    [1] https://github.com/powturbo/TurboBench

    [1] https://sites.google.com/site/powturbo/home/web-compression

    [2] https://encode.su/threads/2333-TurboBench-Back-to-the-future...

  • Pigz: Parallel gzip for modern multi-processor, multi-core machines
    15 projects | news.ycombinator.com | 12 May 2023
    Build or download TurboBench [1] executables for Linux and Windows from releases [2] and make your own tests comparing Oodle, zstd and other compressors.

    [1] https://github.com/powturbo/TurboBench

    [2] https://github.com/powturbo/TurboBench/releases

rapidgzip

Posts with mentions or reviews of rapidgzip. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-09-04.
  • Show HN: Rapidgzip – Parallel Gzip Decompressing with 10 GB/S
    3 projects | news.ycombinator.com | 4 Sep 2023
  • Ebiggers/libdeflate: Heavily optimized DEFLATE/zlib/gzip library
    5 projects | news.ycombinator.com | 26 Aug 2023
    I also did benchmarks with zlib and libarchivemount via their library interface here [0]. It has been a while since I ran them, so I forgot. Unfortunately, I did not add libdeflate.

    [0] https://github.com/mxmlnkn/rapidgzip/blob/master/src/benchma...

  • Rapidgzip – Parallel Decompression and Seeking in Gzip (Knespel, Brunst – 2023) [pdf]
    3 projects | news.ycombinator.com | 21 Aug 2023
    Hi, author here.

    You are right that the index is the easy mode. Over the years there have been lots of implementations trying to add an index like that to the gzip metadata itself or as a sidecar file, with bgzip probably being the best known one. None of them really stuck, hence the necessity for some generic multi-threaded decompressor. A probably incomplete list of such implementations can be found in this issue: https://github.com/mxmlnkn/rapidgzip/issues/8

    The index makes it so easy that I can simply delegate decompression to zlib. And since the paper's publication I've actually improved upon this by delegating to ISA-L / igzip instead, which is twice as fast. This is already in the 0.8.0 release.

    As derived from Table 1, there is on average one false positive per 1 Tbit / 202 ≈ 5 Gbit, or about 625 MB, for deflate blocks with dynamic Huffman codes. For non-compressed blocks, the false positive rate is roughly one per 500 KB; however, non-compressed blocks can basically be memcpied or skipped over, and then the next deflate header can be checked without much latency. For dynamic blocks, on the other hand, the whole block needs to be decompressed first to find the next one. So the much higher false positive rate for non-compressed blocks doesn't introduce that much overhead.

    I have some profiling built into rapidgzip, which is printed with -v, e.g., rapidgzip -v -d -o /dev/null 20xsilesia.tar.gz :

        Time spent in block finder              : 0.227751 s
  • Intel QuickAssist Technology Zstandard Plugin for Zstandard
    10 projects | news.ycombinator.com | 16 Aug 2023
  • Tool and Library for Parallel Gzip Decompression and Random Access
    1 project | news.ycombinator.com | 12 May 2023
  • Pigz: Parallel gzip for modern multi-processor, multi-core machines
    15 projects | news.ycombinator.com | 12 May 2023
    I have not only implemented parallel decompression but also random access to offsets in the stream with https://github.com/mxmlnkn/pragzip. I did some benchmarks on some really beefy machines with 128 cores and was able to reach almost 20 GB/s decompression bandwidth. The single-core decoder has lots of potential for optimization, though, because I had to write it from scratch.
  • Parquet: More than just “Turbo CSV”
    7 projects | news.ycombinator.com | 3 Apr 2023
    Decompression of arbitrary gzip files can be parallelized with pragzip: https://github.com/mxmlnkn/pragzip
  • The Cost of Exception Handling
    1 project | news.ycombinator.com | 13 Nov 2022
    At the very least you are duplicating logic without the exception. The check for eof has to be done implicitly anyway inside read because it has to fill the bit buffer with data from the byte buffer or the byte buffer with data from the file. And if both fail, then we already know the result of eof, so no need to duplicate checking for eof in the outer read calling loop.

    Here is the full commit with ad-hoc benchmark results in the commit message:

    https://github.com/mxmlnkn/pragzip/commit/0b1af498377838c30f...

    and here the benchmarks I ran at that time:

    https://github.com/mxmlnkn/pragzip/blob/0b1af498377838c30fea...

    As you can see, it's part of my random-seekable multi-threaded gzip and bzip2 parallel decompression libraries.

    What you can also see in the commit message is that it wasn't a 50% time reduction but a 50% bandwidth increase, which would translate to a 30% time reduction. It seems I remembered that partly wrong. But it still was a significant optimization for me.

  • How Much Faster Is Making a Tar Archive Without Gzip?
    8 projects | news.ycombinator.com | 10 Oct 2022
  • Show HN: Thread-Parallel Decompression and Random Access to Gzip Files (Pragzip)
    1 project | news.ycombinator.com | 6 Aug 2022
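
Two figures quoted in the rapidgzip posts above can be checked with quick arithmetic: the block finder's false-positive spacing ("1 Tbit / 202") and the conversion of a 50% bandwidth increase into a runtime reduction. The following is a minimal Python sketch, not code from either project. It assumes a decimal terabit (10^12 bits) and reads the quoted figure as 202 false positives per terabit scanned; the variable names are made up for illustration.

```python
# Sanity check of two figures quoted in the posts above.
# Assumptions (not stated in the source): 1 Tbit = 10**12 bits, and
# "1 Tbit / 202" means 202 false positives per terabit scanned.

TBIT = 10**12
false_positives_per_tbit = 202

# 1) Average amount of data scanned per false positive.
bits_per_false_positive = TBIT / false_positives_per_tbit
megabytes_per_false_positive = bits_per_false_positive / 8 / 10**6

# 2) A bandwidth increase of x means the same data takes 1 / (1 + x)
#    of the original time, so the time reduction is 1 - 1 / (1 + x).
bandwidth_gain = 0.50
time_reduction = 1 - 1 / (1 + bandwidth_gain)

print(f"{bits_per_false_positive / 1e9:.2f} Gbit per false positive")
print(f"{megabytes_per_false_positive:.0f} MB per false positive")
print(f"{time_reduction:.1%} less time")
```

The results, about 4.95 Gbit (roughly 619 MB) per false positive and a 33.3% runtime reduction, line up with the rounded "5 Gbit or 625 MB" and "30%" quoted in the posts.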

What are some alternatives?

When comparing TurboBench and rapidgzip you can also consider the following projects:

QAT-ZSTD-Plugin

pigz - A parallel implementation of gzip for modern multi-processor, multi-core machines.

libdeflate - Heavily optimized library for DEFLATE/zlib/gzip compression and decompression

DirectStorage - DirectStorage for Windows is an API that allows game developers to unlock the full potential of high speed NVMe drives for loading game assets.

QATzip - Compression Library accelerated by Intel® QuickAssist Technology

lib842

parquet-format - Apache Parquet

nvcomp - Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.

FPC - FPC - Fast Prefix Coder

pixz - Parallel, indexed xz compressor