Top 23 C Compression Projects
-
cute_headers
Collection of cross-platform one-file C/C++ libraries with no dependencies, primarily used for games
-
cstore_fdw
Columnar storage extension for Postgres built as a foreign data wrapper. Check out https://github.com/citusdata/citus for a modernized columnar storage implementation built as a table access method.
-
astc-encoder
The Arm ASTC Encoder, a compressor for the Adaptive Scalable Texture Compression data format.
-
p7zip
A new p7zip fork with additional codecs and improvements (forked from https://sourceforge.net/projects/sevenzip/ AND https://sourceforge.net/projects/p7zip/).
-
lizard
Lizard (formerly LZ5) is an efficient compressor with very fast decompression. It achieves compression ratio that is comparable to zip/zlib and zstd/brotli (at low and medium compression levels) at decompression speed of 1000 MB/s and faster. (by inikep)
Project mention: Rethinking string encoding: a 37.5% space efficient encoding than UTF-8 in Fury | news.ycombinator.com | 2024-05-07
> In such cases, the serialized binary are mostly in 200~1000 bytes. Not big enough for zstd to work
You're not referring to the same dictionary that I am. Look at --train in [1].
If you have a training corpus of representative data, you can generate a dictionary that you preshare on both sides which will perform much better for very small binaries (including 200-1k bytes).
If you want maximum flexibility (i.e. you don't know the universe of representative messages ahead of time, or you want maximum compression performance), you can gather this corpus transparently as messages are generated, then generate a dictionary and attach it as sideband metadata to a message. You'll probably need to defer decoding if a message references a dictionary not yet received (i.e. if the transport delivers messages in a different order than they were generated). There are other techniques you can apply, but the general rule is that your custom encoding scheme is unlikely to outperform zstd + a representative training corpus. If it does, you'd need to actually show this rather than try to argue from first principles.
[1] https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md
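The preshared-dictionary idea in the comment above can be tried with nothing but Python's standard library. What follows is a minimal sketch under stated assumptions, not zstd itself: it uses zlib's preset dictionary (`zdict`) as a stand-in for a trained zstd dictionary, the sample messages are made up, and the "dictionary" is just concatenated samples rather than the output of a trainer such as `zstd --train`. It only illustrates why priming helps on very small messages.

```python
import json
import zlib
from typing import Optional

# Hypothetical sample messages standing in for a representative training corpus.
SAMPLES = [
    json.dumps({"user_id": i, "event": "page_view", "path": "/home", "status": "ok"}).encode()
    for i in range(50)
]

# Assumption: zlib has no dictionary trainer, so the preset dictionary here is
# simply the concatenated samples, cut to deflate's 32 KiB window.
PRESET_DICT = b"".join(SAMPLES)[-32768:]


def compress(message: bytes, zdict: Optional[bytes] = None) -> bytes:
    """Deflate one message, optionally priming the compressor with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(message) + comp.flush()


def decompress(blob: bytes, zdict: Optional[bytes] = None) -> bytes:
    """Inverse of compress(); the receiver must hold the same dictionary."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    message = json.dumps(
        {"user_id": 1234, "event": "page_view", "path": "/settings", "status": "ok"}
    ).encode()
    plain = compress(message)
    primed = compress(message, PRESET_DICT)
    # Prints: original size, size without dictionary, size with dictionary.
    print(len(message), len(plain), len(primed))
```

Both sides must hold exactly the same dictionary bytes; decompressing a primed message without the dictionary fails. A real deployment along the lines the comment describes would presumably use zstd's own dictionary tooling instead of this zlib substitute.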
Opus doesn't support 44.1 kHz because of compatibility concerns and the effort/benefit ratio:
https://github.com/xiph/opus/issues/43
The browser audio limitation is presumably a workaround to some bug or performance limitation that was relevant at some point in history (the site was created in 2014).
Project mention: Moving a Billion Postgres Rows on a $100 Budget | news.ycombinator.com | 2024-02-21
Columnar store extensions for PostgreSQL exist. Here are two, though I think I'm missing at least one more:
https://github.com/citusdata/cstore_fdw
https://github.com/hydradatabase/hydra
You can also connect other stores using foreign data wrappers, such as Parquet files on an object store, DuckDB, or ClickHouse, though joins aren't optimised: PostgreSQL would do a full scan of the external table when joining.
Project mention: Show HN: Pzip- blazing fast concurrent zip archiver and extractor | news.ycombinator.com | 2023-09-24
Please note that allowing a 2% bigger output file could mean a huge speedup in these circumstances, even with the same compression routines. See these benchmarks of zlib and zlib-ng at different compression levels:
https://github.com/zlib-ng/zlib-ng/discussions/871
IMO, a fair comparison of the real speed improvement brought by a new program is only possible between almost identical compressed sizes.
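The size-versus-speed trade-off described above can be observed with a rough sketch like the one below. It is an illustration only: it uses Python's bundled zlib (not zlib-ng or pzip), the sample data and the helper names `make_sample`, `benchmark`, and `size_overhead_pct` are hypothetical, and absolute timings will differ from machine to machine.

```python
import random
import time
import zlib


def make_sample(size: int = 200_000, seed: int = 0) -> bytes:
    """Build deterministic, text-like data from a small made-up vocabulary."""
    rng = random.Random(seed)
    vocab = [b"alpha", b"beta", b"gamma", b"delta", b"epsilon", b"zeta", b"eta", b"theta"]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(vocab) + b" "
    return bytes(out[:size])


def benchmark(data: bytes, levels=(1, 6, 9)):
    """Return a list of (level, compressed_size, seconds) for each zlib level."""
    results = []
    for level in levels:
        start = time.perf_counter()
        blob = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        results.append((level, len(blob), elapsed))
    return results


def size_overhead_pct(size: int, best_size: int) -> float:
    """How much bigger `size` is than `best_size`, in percent."""
    return 100.0 * (size - best_size) / best_size


if __name__ == "__main__":
    results = benchmark(make_sample())
    best = min(size for _, size, _ in results)
    for level, size, seconds in results:
        print(f"level {level}: {size} bytes "
              f"(+{size_overhead_pct(size, best):.1f}%), {seconds * 1000:.1f} ms")
```

Reading the output with the comment's point in mind: a speed claim for one tool over another is only meaningful when the rows being compared have nearly the same size overhead.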
Project mention: Intel QuickAssist Technology Zstandard Plugin for Zstandard | news.ycombinator.com | 2023-08-16
It's obsolete. It's limited to a 32 KB LZ window with Huffman coding. Zstd can use a much larger window (8 MB recommended) and a much better entropy coder: https://github.com/Cyan4973/FiniteStateEntropy
For a benchmark on a standard set: https://github.com/inikep/lzbench/blob/master/lzbench18_sort...
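The effect of a 32 KB match window can be shown with a small synthetic experiment. This is a sketch under stated assumptions: the payload is artificial (a random block repeated after random filler), zlib represents the 32 KiB-window deflate case, and Python's `lzma` module stands in for a large-window codec because zstd is not assumed to be available in the standard library. It says nothing about QAT hardware or about real-world ratios.

```python
import lzma
import random
import zlib

KIB = 1024


def random_bytes(n: int, rng: random.Random) -> bytes:
    """Deterministic pseudo-random (incompressible) bytes."""
    return bytes(rng.getrandbits(8) for _ in range(n))


def make_payload(gap_kib: int, seed: int = 0) -> bytes:
    """An 8 KiB random block, `gap_kib` KiB of unrelated filler, then the block again."""
    rng = random.Random(seed)
    block = random_bytes(8 * KIB, rng)
    filler = random_bytes(gap_kib * KIB, rng)
    return block + filler + block


if __name__ == "__main__":
    near = make_payload(16)  # repeat is 24 KiB back: inside deflate's 32 KiB window
    far = make_payload(40)   # repeat is 48 KiB back: outside deflate's window

    print("near, zlib:", len(near), "->", len(zlib.compress(near)))
    print("far,  zlib:", len(far), "->", len(zlib.compress(far)))
    print("far,  lzma:", len(far), "->", len(lzma.compress(far)))
```

When the repeat lies inside the window, zlib can replace the second block with back-references; once it falls outside, zlib has nothing to refer to, while a codec with a larger window can still find the match.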
Project mention: Show HN: Time Series Benchmark TurboPFor,TurboFloat,TurboFloat LzX,TurboGorilla | news.ycombinator.com | 2023-06-25
Kamila Szewczyk is working on bzip3 to improve the state of the art in Burrows-Wheeler-based compressors:
https://github.com/kspalaiologos/bzip3
I'm keeping my fingers crossed for the project, especially given that the author is 19 and her best work is yet to come.
C Compression related posts
-
LuaRT 1.8.0 – open-source Windows programming framework for Lua
-
Ask HN: Why are people so mean in the open source community? (about xz again)
-
VDO: Userspace tools for pools of deduplicated and compressed block storage
-
Rethinking string encoding: a 37.5% space efficient encoding than UTF-8 in Fury
-
Drink Me: (Ab)Using a LLM to Compress Text
-
FC8 – Faster 68K Decompression (2016)
-
SQLite VFS for ZSTD seekable format
Index
What are some of the best open-source Compression projects in C? This list will help you:
# | Project | Stars |
---|---|---|
1 | zstd | 22,581 |
2 | LZ4 | 9,312 |
3 | ZLib | 5,346 |
4 | cute_headers | 4,129 |
5 | opus | 2,142 |
6 | LZFSE | 1,759 |
7 | cstore_fdw | 1,738 |
8 | zlib-ng | 1,453 |
9 | zip | 1,333 |
10 | FiniteStateEntropy | 1,263 |
11 | Minizip-ng | 1,173 |
12 | smaz | 1,131 |
13 | astc-encoder | 997 |
14 | c-blosc | 963 |
15 | lzbench | 848 |
16 | TurboPFor | 746 |
17 | p7zip | 744 |
18 | squashfs-tools | 719 |
19 | bzip3 | 650 |
20 | lizard | 639 |
21 | zson | 527 |
22 | simdcomp | 476 |
23 | gozstd | 419 |