Parallelising Huffman decoding and x86 disassembly by synchronising prefix codes

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • dietgpu

    GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compression of numerical and other data types in HPC/ML applications.

  • ANS is super fast and trivially parallelizable, faster than Huffman and especially faster than arithmetic encoding. It is fast because it can be machine-word oriented (you can read/write whole machine words at a time, not arbitrary variable-bit-length sequences), and as a result you can interleave any number of independent (parallel) encoders in the same stream with just a prefix sum to figure out where to write the state normalization values. I for one got up to 400 GB/s throughput on A100 GPUs in my implementation (https://github.com/facebookresearch/dietgpu).

    ANS can also self-synchronize.
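
The prefix-sum step mentioned in the comment above can be sketched roughly as follows. This is a toy CPU illustration, not dietgpu's actual code: the function names `write_offsets` and `compact_writes`, the 16-bit word size, and the idea that each lane emits either one word or nothing per step are all assumptions made for the example.

```python
from itertools import accumulate

def write_offsets(emit_flags):
    """Exclusive prefix sum over per-lane emit flags.

    Lane i writes at offset = number of emitting lanes before it.
    Returns (offsets, total_words_written).
    """
    counts = [1 if flag else 0 for flag in emit_flags]
    sums = list(accumulate(counts, initial=0))  # length is len(counts) + 1
    return sums[:-1], sums[-1]

def compact_writes(states, emit_flags, word_bits=16):
    """Pack the low word of every emitting lane into one dense output buffer."""
    offsets, total = write_offsets(emit_flags)
    out = [0] * total
    mask = (1 << word_bits) - 1
    for lane, (state, emit) in enumerate(zip(states, emit_flags)):
        if emit:
            # Each lane knows its slot without coordinating with other lanes.
            out[offsets[lane]] = state & mask
    return out

if __name__ == "__main__":
    states = [0x12345, 0x00777, 0xABCDE, 0x00042]  # hypothetical lane states
    flags = [True, False, True, False]             # which lanes must renormalize
    print([hex(w) for w in compact_writes(states, flags)])
```

On a GPU the loop body would run once per thread and the prefix sum would be a parallel scan, which is why whole-word output makes the interleaving cheap.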

  • gpuhd

    Massively Parallel Huffman Decoding on GPUs

  • https://github.com/weissenberger/gpuhd

    The authors of this repo/paper use the self-synchronizing property of almost all Huffman codes to implement parallel Huffman decoding on the GPU at ~10 GB/s. In practice, I haven't found this useful for Huffman decoding when the data lives on the CPU, since the cost of the GPU round-trip outweighs the GPU's speed advantage. But if your data is already on the GPU, this is a really cool way to do Huffman decoding.
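
The self-synchronizing property described in the comment above can be illustrated with a small sketch. This is not the gpuhd implementation: the four-symbol code table and the helper names `decode_from` and `sync_point` are made up for the example. It only shows that a decoder started at a wrong bit offset eventually lands on a true codeword boundary, after which its output matches the correct decoding.

```python
# Hypothetical prefix code, chosen only for illustration.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
DECODE = {bits: sym for sym, bits in CODE.items()}

def encode(text):
    """Concatenate codewords into a string of '0'/'1' characters."""
    return "".join(CODE[ch] for ch in text)

def decode_from(bits, start):
    """Decode greedily from bit offset `start`.

    Returns a list of (codeword_start_position, symbol) pairs.
    """
    out, pos, cur = [], start, ""
    for i in range(start, len(bits)):
        cur += bits[i]
        if cur in DECODE:
            out.append((pos, DECODE[cur]))
            pos, cur = i + 1, ""
    return out

def sync_point(bits, start):
    """First codeword boundary a decoder started at `start` shares with the
    decoding from offset 0, or None if the two never line up."""
    true_starts = {p for p, _ in decode_from(bits, 0)}
    for p, _ in decode_from(bits, start):
        if p in true_starts:
            return p
    return None

if __name__ == "__main__":
    bits = encode("dcabdbacad")
    p = sync_point(bits, 1)  # deliberately start mid-codeword
    good = [s for q, s in decode_from(bits, 0) if q >= p]
    shifted = [s for q, s in decode_from(bits, 1) if q >= p]
    print(p, good == shifted)
```

In a parallel decoder, each worker would start at an arbitrary offset in its chunk, and only the symbols before its synchronization point need to be reconciled with the neighbouring worker.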

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
