InfluxDB is the Time Series Platform where developers build real-time applications for analytics, IoT and cloud-native services. Easy to start, it is available in the cloud or on-premises.
Top 23 C Compression Projects
-
It is interesting to compare this with observations from using ClickHouse[1][2] for time series in practice:
1. Reordering to SoA (structure of arrays) helps a lot; this is the whole point of column-oriented databases.
2. Specialized codecs like Gorilla[3], DoubleDelta[4], and FPC[5] lose to simply using ZSTD[6] compression in most cases, both in compression ratio and in performance.
3. Specialized time-series DBMS like InfluxDB or TimescaleDB lose to general-purpose relational OLAP DBMS like ClickHouse [7][8][9].
[1] https://clickhouse.com/blog/optimize-clickhouse-codecs-compr...
[2] https://github.com/ClickHouse/ClickHouse
[3] https://clickhouse.com/docs/en/sql-reference/statements/crea...
[4] https://clickhouse.com/docs/en/sql-reference/statements/crea...
[5] https://clickhouse.com/docs/en/sql-reference/statements/crea...
[6] https://github.com/facebook/zstd/
[7] https://arxiv.org/pdf/2204.09795.pdf "SciTS: A Benchmark for Time-Series Databases in Scientific Experiments and Industrial Internet of Things" (2022)
[8] https://gitlab.com/gitlab-org/incubation-engineering/apm/apm... https://gitlab.com/gitlab-org/incubation-engineering/apm/apm...
[9] https://www.sciencedirect.com/science/article/pii/S187705091...
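To make the first observation concrete, here is a minimal sketch comparing a row-oriented (AoS) and a column-oriented (SoA) layout of the same records under a general-purpose compressor. Everything in it is an assumption made for illustration: the sensor data is synthetic, the record layout (64-bit timestamp plus 16-bit value) is invented, and Python's built-in zlib stands in for ZSTD, which is not in the standard library. It is not a benchmark of ClickHouse or of any codec named above.

```python
import struct
import zlib

N = 5000
# Hypothetical sensor log: one-second timestamps and small cyclic readings.
timestamps = [1_675_000_000 + i for i in range(N)]
values = [(i * 7919) % 251 for i in range(N)]

# AoS (row-oriented): every record is a timestamp immediately followed by its value.
aos = b"".join(struct.pack("<QH", t, v) for t, v in zip(timestamps, values))

# SoA (column-oriented): all timestamps first, then all values.
soa = struct.pack(f"<{N}Q", *timestamps) + struct.pack(f"<{N}H", *values)

# Same bytes overall, only the ordering differs.
aos_size = len(zlib.compress(aos, 6))
soa_size = len(zlib.compress(soa, 6))

print(f"raw bytes:      {len(aos)}")
print(f"AoS compressed: {aos_size}")
print(f"SoA compressed: {soa_size}")
```

On this synthetic input the columnar layout compresses to fewer bytes, because each column repeats with its own period and the compressor can match long runs within it. Real data will behave differently, so treat the numbers as an illustration only.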
-
We serve your site from a global cache location close to your visitors to make sure your site loads fast. In addition, we use an advanced HTML and text compression algorithm called Brotli. Compressed content is now cached, so we can send it directly to your visitors instead of compressing each request individually. In our tests this often improves loading speed by up to 2x, which will have a very positive impact on your Lighthouse scores like LCP. This will be especially noticeable on larger sites, so you can scale your site without worry.
-
Project mention: Micron Unveils 24GB and 48GB DDR5 Memory Modules | AMD EXPO and Intel XMP 3.0 compatible | reddit.com/r/gadgets | 2023-01-21
Yeah, sure, when you have monster core counts. On regular systems, not so much. Here's a figure from their own GitHub page: it achieves, eh, 5GB/s on memory-to-memory transfers, i.e. the best-case scenario. So, uh, no? I'm not even sure it's any better than the CPU decompressor Nvidia used.
-
A minimal viable Deflate decompressor is not exactly complex[1], although slower than mainline zlib.
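As a rough illustration of that claim, the sketch below is a small Deflate (RFC 1951) decoder. It is written in Python for brevity rather than C, it is not the implementation the comment links to, and the names `BitReader`, `build`, `decode`, `inflate` and `raw_deflate` are made up for this example. It does no error checking beyond rejecting the reserved block type, and it is far slower than zlib.

```python
import zlib

LEN_BASE = [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
            35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258]
LEN_EXTRA = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2,
             3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0]
DIST_BASE = [1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193, 257, 385,
             513, 769, 1025, 1537, 2049, 3073, 4097, 6145, 8193, 12289, 16385, 24577]
DIST_EXTRA = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7,
              8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13]
CL_ORDER = [16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15]

class BitReader:
    def __init__(self, data):
        self.data, self.pos = data, 0          # pos is a bit offset
    def bits(self, n):
        value = 0
        for i in range(n):                     # Deflate packs bits LSB-first
            value |= ((self.data[self.pos >> 3] >> (self.pos & 7)) & 1) << i
            self.pos += 1
        return value

def build(lengths):
    """Canonical Huffman table: (code length, code) -> symbol."""
    count = [0] * 16
    for n in lengths:
        count[n] += 1
    count[0] = 0
    code, next_code = 0, [0] * 16
    for n in range(1, 16):
        code = (code + count[n - 1]) << 1
        next_code[n] = code
    table = {}
    for sym, n in enumerate(lengths):
        if n:
            table[(n, next_code[n])] = sym
            next_code[n] += 1
    return table

def decode(br, table):
    code = 0
    for n in range(1, 16):                     # Huffman codes arrive MSB-first
        code = (code << 1) | br.bits(1)
        if (n, code) in table:
            return table[(n, code)]
    raise ValueError("invalid Huffman code")

def inflate(data):
    br, out = BitReader(data), bytearray()
    final = 0
    while not final:
        final, btype = br.bits(1), br.bits(2)
        if btype == 0:                         # stored block
            br.pos = (br.pos + 7) & ~7         # skip to the next byte boundary
            size = br.bits(16)
            br.bits(16)                        # NLEN (complement of size), not verified
            start = br.pos >> 3
            out += data[start:start + size]
            br.pos += 8 * size
            continue
        if btype == 1:                         # fixed Huffman codes
            lit = build([8] * 144 + [9] * 112 + [7] * 24 + [8] * 8)
            dist = build([5] * 30)
        elif btype == 2:                       # dynamic Huffman codes
            hlit, hdist, hclen = br.bits(5) + 257, br.bits(5) + 1, br.bits(4) + 4
            cl = [0] * 19
            for i in range(hclen):
                cl[CL_ORDER[i]] = br.bits(3)
            cl_table, lengths = build(cl), []
            while len(lengths) < hlit + hdist:
                sym = decode(br, cl_table)
                if sym < 16:
                    lengths.append(sym)
                elif sym == 16:
                    lengths += [lengths[-1]] * (3 + br.bits(2))
                elif sym == 17:
                    lengths += [0] * (3 + br.bits(3))
                else:
                    lengths += [0] * (11 + br.bits(7))
            lit, dist = build(lengths[:hlit]), build(lengths[hlit:hlit + hdist])
        else:
            raise ValueError("reserved block type")
        while True:
            sym = decode(br, lit)
            if sym == 256:                     # end of block
                break
            if sym < 256:
                out.append(sym)
                continue
            size = LEN_BASE[sym - 257] + br.bits(LEN_EXTRA[sym - 257])
            d = decode(br, dist)
            distance = DIST_BASE[d] + br.bits(DIST_EXTRA[d])
            for _ in range(size):              # byte-wise copy handles overlapping matches
                out.append(out[-distance])
    return bytes(out)

def raw_deflate(data, level=6):
    c = zlib.compressobj(level, zlib.DEFLATED, -15)   # negative wbits: raw Deflate, no zlib header
    return c.compress(data) + c.flush()

text = b"a minimal viable Deflate decompressor " * 20
print(inflate(raw_deflate(text)) == text)
```

The whole decoder fits in well under a hundred lines. Most of what production libraries add is speed (table-driven Huffman decoding, wide copies) and validation of malformed input.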
-
cute_headers
Collection of cross-platform one-file C/C++ libraries with no dependencies, primarily used for games
Project mention: How many colors are too many colors for Windows Terminal? | news.ycombinator.com | 2022-05-14
https://github.com/RandyGaul/cute_headers/blob/master/cute_s...
It's a simple and relatively straightforward approach that a sufficiently bright programmer would come up with on their own while looking at the design constraints, though, so overall I find it a bit meaningless to search for the ultimate person behind the "original idea".
-
cstore_fdw
Columnar storage extension for Postgres built as a foreign data wrapper. Check out https://github.com/citusdata/citus for a modernized columnar storage implementation built as a table access method.
That appears to be the case:
https://github.com/citusdata/cstore_fdw
> Important notice: Columnar storage is now part of Citus
-
If there are multiple tags with the same name, FFmpeg will only use the last tag. If you really need to have multiple tags with the same name in your Opus files, use opusenc instead (https://opus-codec.org/). Beware that some playback software does not display multiple artists gracefully.
-
Zlib-ng doesn't contain the same code, but it appears that their equivalent inflate() when used with their inflateGetHeader() implementation was affected by a similar problem: https://github.com/zlib-ng/zlib-ng/pull/1328
Similarly, most client code will be unaffected: `state->head` will be NULL because most clients never call inflateGetHeader() at all.
-
Sounds similar to: https://github.com/Cyan4973/FiniteStateEntropy
https://arxiv.org/abs/1311.2540
> The modern data compression is mainly based on two approaches to entropy coding: Huffman (HC) and arithmetic/range coding (AC). The former is much faster, but approximates probabilities with powers of 2, usually leading to relatively low compression rates. The latter uses nearly exact probabilities - easily approaching theoretical compression rate limit (Shannon entropy), but at cost of much larger computational cost.
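The "powers of 2" limitation in the quoted abstract can be seen with a few lines of arithmetic. The sketch below builds Huffman code lengths for a made-up, heavily skewed three-symbol source and compares the average code length with the Shannon entropy. The distribution and the helper name `huffman_code_lengths` are assumptions for illustration; this is not code from FiniteStateEntropy or from the paper.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    # Heap entries are (probability, tiebreaker, symbols in this subtree).
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probs}
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:          # merging two subtrees adds one bit to each member
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths

# Hypothetical skewed source: one very common symbol and two rare ones.
probs = {"a": 0.90, "b": 0.05, "c": 0.05}

lengths = huffman_code_lengths(probs)
average_bits = sum(p * lengths[sym] for sym, p in probs.items())
entropy_bits = -sum(p * math.log2(p) for p in probs.values())

print("code lengths:", lengths)
print(f"Huffman average: {average_bits:.3f} bits/symbol")
print(f"Shannon entropy: {entropy_bits:.3f} bits/symbol")
```

A Huffman code cannot spend less than one whole bit on the common symbol, even though its information content is only about 0.15 bits. That gap is what arithmetic coding, and faster table-based schemes in the same family, aim to close.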
-
Project mention: WASM compression benchmarks and the cost of missing compression APIs | news.ycombinator.com | 2023-02-02
Related to compressing data before storing on SSD:
Blosc - faster than memcpy()
https://github.com/Blosc/c-blosc
Under the right circumstances Blosc is so fast that it even speeds up reading data from RAM (read less, decompress in the L1 and L2 caches).
-
Project mention: Lizard – efficient compression with fast decompression | news.ycombinator.com | 2022-05-24
Note that a benchmark in the README refers to zstd 1.1.1 and brotli 0.5.2, which are very old (the current versions are zstd 1.5.2 and brotli 1.0.9). The same author maintains lzbench [1], which is more or less up-to-date.
-
Project mention: q_compress 0.7: still has 35% higher compression ratio than .zstd.parquet for numerical sequences, now with delta encoding and 2x faster than before | reddit.com/r/rust | 2022-02-17
I'm the author of TurboPFor-Integer-Compression. Q_compress is a very interesting project; unfortunately, it's difficult to compare it to other algorithms. There are no binaries or test data files (with q_compress results) available for a simple benchmark. A speed comparison would also be helpful.
-
p7zip
A new p7zip fork with additional codecs and improvements (forked from https://sourceforge.net/projects/sevenzip/ AND https://sourceforge.net/projects/p7zip/).
Project mention: 7-zip 22.00 – APFS, Posix TAR, high precision timestamps | news.ycombinator.com | 2022-06-23
Thank you for pointing this out! This is the source of much confusion. Although Arch for example uses https://github.com/jinfeihan57/p7zip which seems to be reasonably maintained?
-
lizard
Lizard (formerly LZ5) is an efficient compressor with very fast decompression. It achieves compression ratio that is comparable to zip/zlib and zstd/brotli (at low and medium compression levels) at decompression speed of 1000 MB/s and faster. (by inikep)
Project mention: Lizard – efficient compression with fast decompression | news.ycombinator.com | 2022-05-24
bzip3: https://github.com/kspalaiologos/bzip3
-
Have you considered ZSON? It requires you to compile a dictionary based on sample data - but given you're expecting most of your data to fit one of two variations this could be quite space efficient.
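The general idea of a dictionary built from sample data can be sketched with zlib's preset-dictionary support, which is available in Python's standard library. This is an analogy only: it is not ZSON's algorithm or API, and the JSON field names, the sample document and the helper names `compress` and `decompress` are invented for the example.

```python
import json
import zlib

# Hypothetical sample document used to "train" the dictionary.
sample = {"user_id": 0, "event_type": "page_view",
          "timestamp": "2023-01-01T00:00:00Z",
          "session_id": "", "referrer": "https://example.com/"}
dictionary = json.dumps(sample).encode()

# A new document with the same shape but different values.
doc = json.dumps({"user_id": 48213, "event_type": "page_view",
                  "timestamp": "2023-02-08T09:15:42Z",
                  "session_id": "a1b2c3",
                  "referrer": "https://example.com/pricing"}).encode()

def compress(data, zdict=None):
    # zdict primes the compressor's window with bytes it may reference.
    c = zlib.compressobj(9) if zdict is None else zlib.compressobj(9, zdict=zdict)
    return c.compress(data) + c.flush()

def decompress(blob, zdict):
    # The reader must supply exactly the same dictionary.
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()

plain = compress(doc)
with_dict = compress(doc, dictionary)

print(f"original:        {len(doc)} bytes")
print(f"zlib:            {len(plain)} bytes")
print(f"zlib + dict:     {len(with_dict)} bytes")
print("round trip ok:  ", decompress(with_dict, dictionary) == doc)
```

Small documents gain the most, since on their own they contain too little repetition for the compressor to exploit. The cost is that the dictionary has to be stored and versioned alongside the data.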
-
You need to install a few dependencies, notably the SquashFS and FUSE, and Snap itself:
-
Project mention: How to remove all columns except a select few with data.table? | reddit.com/r/Rlanguage | 2022-10-12
My initial rec was to use the fst package, but I read the other thread and you said it's not compatible. qs works in a similar way, with similar performance, and should be compatible with any R version > 3.0.2, but I have personally never used it.
-
C Compression related posts
- WASM compression benchmarks and the cost of missing compression APIs
- Framer Update: 2x Faster Sites
- Float Compression 3: Filters
- Micron Unveils 24GB and 48GB DDR5 Memory Modules | AMD EXPO and Intel XMP 3.0 compatible
- Weekly r/Tattoos Question/FreeTalk Thread! - January 14, 2023
- We're wasting money by only supporting gzip for raw DNA files
- Multiple tags with the same name in metadata
-
A note from our sponsor - InfluxDB
www.influxdata.com | 8 Feb 2023
Index
What are some of the best open-source Compression projects in C? This list will help you:
# | Project | Stars |
---|---|---|
1 | zstd | 19,412 |
2 | brotli | 11,855 |
3 | LZ4 | 7,939 |
4 | ZLib | 4,172 |
5 | cute_headers | 3,714 |
6 | LZFSE | 1,698 |
7 | cstore_fdw | 1,696 |
8 | opus | 1,664 |
9 | zlib-ng | 1,170 |
10 | FiniteStateEntropy | 1,154 |
11 | zip | 1,093 |
12 | smaz | 1,040 |
13 | Minizip-ng | 980 |
14 | c-blosc | 869 |
15 | lzbench | 715 |
16 | TurboPFor | 651 |
17 | p7zip | 586 |
18 | lizard | 581 |
19 | bzip3 | 501 |
20 | zson | 483 |
21 | squashfs-tools | 441 |
22 | simdcomp | 411 |
23 | qs | 342 |