casync vs zstd
 | casync | zstd
---|---|---
Mentions | 17 | 105
Stars | 1,461 | 22,293
Growth | 0.7% | 1.9%
Activity | 2.4 | 9.6
Last commit | 4 months ago | 8 days ago
Language | C | C
License | - | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
casync
-
Tool to clone file structure without the large files themselves?
You probably want casync.
-
LibSQL – a fork of SQLite that is both Open Source, and Open Contributions
(personally, I think more people need to be aware of casync for the update storage/distribution problem. It isn't perfect for every use case, but it's good enough that you're probably better off wrapping/forking it rather than reimplementing it badly from scratch)
-
improving download infra
Does something like casync (https://github.com/systemd/casync or https://github.com/folbricht/desync) serve any purpose or provide any advantage over rsync for propagating rpm changes?
-
Are there any true alternatives to Seafile? (Nextcloud is not an alternative in this context)
Software that comes to mind for syncing lots of small files: git (and other source versioning tools), casync (https://github.com/systemd/casync) and a go implementation (https://github.com/folbricht/desync). Not really an answer and I can't think of a way to shoehorn that into your workflow, but maybe it leads you down a useful road.
-
Hacker News top posts: Apr 23, 2022
Casync – A Content-Addressable Data Synchronization Tool (15 comments)
-
Casync – A Content-Addressable Data Synchronization Tool
I was wondering how this finds any common chunks at all once file boundaries are removed. It turns out that chunks don't have a set size, just min/max/avg values, so unaligned streams may end up synchronizing. https://github.com/systemd/casync/blob/master/src/cachunker.... If I understood that correctly, that's pretty cool.
But looking at the code I'm having strong "nope" feelings. First, because of lines like "q += m, n -= m;". Second, because of int/enum/semantic abuse: `compression_type` may be _CA_COMPRESSION_TYPE_INVALID which I hope is 0, `>= 0` as a known compression type, or `-EAGAIN` as an error. (from https://github.com/systemd/casync/blob/99559cd1d8cea69b30022... ) I'd bet that just throwing afl at the decompressor will find issues :(
I do like the idea though.
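To make the mechanism from that comment concrete, here is a minimal sketch of content-defined chunking with min/max bounds around a target average. This is an illustrative toy, not casync's algorithm: casync's real chunker (src/cachunker.c) uses a Buzhash rolling hash, and the window size, size bounds, and hash here are made-up stand-ins. The key property is that a cut point depends only on the bytes in the sliding window, which is why two streams that differ by an insertion start producing identical boundaries again shortly after the edit.

```c
/*
 * Toy content-defined chunker (illustrative parameters only; NOT
 * casync's actual Buzhash implementation). Compile: cc -O2 cdc.c
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define WINDOW     48                 /* sliding-window size in bytes */
#define CHUNK_MIN  (16 * 1024)        /* hard lower bound on chunk size */
#define CHUNK_MAX  (256 * 1024)       /* hard upper bound on chunk size */
#define AVG_MASK   ((1u << 16) - 1)   /* a boundary ~every 64 KiB on average */

static void chunk_buffer(const uint8_t *data, size_t len)
{
    uint8_t win[WINDOW] = {0};        /* ring buffer of the last WINDOW bytes */
    size_t wpos = 0, start = 0;
    const uint32_t base = 31;
    uint32_t hash = 0, base_pow_w = 1;

    for (int i = 0; i < WINDOW; i++)  /* base^WINDOW (mod 2^32) */
        base_pow_w *= base;

    for (size_t i = 0; i < len; i++) {
        uint8_t out = win[wpos];      /* byte leaving the window */
        win[wpos] = data[i];
        wpos = (wpos + 1) % WINDOW;

        /* Polynomial rolling hash over only the last WINDOW bytes;
         * this locality is what lets unaligned streams resynchronize. */
        hash = hash * base + data[i] - out * base_pow_w;

        size_t chunk_len = i - start + 1;
        if (chunk_len < CHUNK_MIN)
            continue;                 /* enforce the minimum size */
        /* Cut where the hash hits the pattern, or at the hard maximum. */
        if ((hash & AVG_MASK) == 0 || chunk_len >= CHUNK_MAX) {
            printf("chunk: offset=%zu size=%zu\n", start, chunk_len);
            start = i + 1;
        }
    }
    if (start < len)
        printf("chunk: offset=%zu size=%zu\n", start, len - start);
}

int main(void)
{
    size_t len = 4u * 1024 * 1024;    /* 4 MiB of pseudo-random input */
    uint8_t *buf = malloc(len);
    if (!buf)
        return 1;
    uint32_t x = 123456789;
    for (size_t i = 0; i < len; i++) {
        x = x * 1664525u + 1013904223u;  /* simple LCG */
        buf[i] = (uint8_t)(x >> 24);
    }
    chunk_buffer(buf, len);
    free(buf);
    return 0;
}
```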
-
Blobcache is a content addressed data store, designed to be a replicated data layer for applications.
Compare https://github.com/systemd/casync which handles splitting/diffing, but does not handle fancy replication.
-
Deduplicating Archiver with Compression and Encryption
zstd
-
Chrome Feature: ZSTD Content-Encoding
Citation needed? https://github.com/facebook/zstd/commits/dev/?author=jiaT75
Of course, you may get different results with another dataset.
gzip (zlib -6) [ratio=32%] [compr=35Mo/s] [dec=407Mo/s]
zstd (zstd -2) [ratio=32%] [compr=356Mo/s] [dec=1067Mo/s]
NB1: The default for zstd is -3, but the table only had -2. The difference is probably small. The range is 1-22 for zstd and 1-9 for gzip.
NB2: The default gzip program (at least on Debian) is the executable from zlib. In my workflows, libdeflate-gzip is compatible and noticeably faster.
NB3: This benchmark is 2 years old. The latest releases of zstd are much better, see https://github.com/facebook/zstd/releases
For high compression, according to this benchmark xz can do slightly better, if you're willing to pay a 10× penalty on decompression.
xz -9 [ratio=23%] [compr=2.6Mo/s] [dec=88Mo/s]
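If it helps to map those CLI level numbers onto code, here is a small sketch using libzstd's one-shot API; the payload string is a placeholder, and the level is the -2 benchmarked above.

```c
/*
 * Sketch: how CLI levels map onto libzstd's one-shot API.
 * Link with -lzstd.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zstd.h>

int main(void)
{
    const char *src = "placeholder payload; in practice a file buffer";
    size_t srcSize = strlen(src);
    size_t cap = ZSTD_compressBound(srcSize);  /* worst-case output size */
    void *dst = malloc(cap);
    if (!dst)
        return 1;

    int level = 2;  /* the "zstd -2" above; the default is ZSTD_CLEVEL_DEFAULT (3) */
    size_t n = ZSTD_compress(dst, cap, src, srcSize, level);
    if (ZSTD_isError(n)) {
        fprintf(stderr, "error: %s\n", ZSTD_getErrorName(n));
        free(dst);
        return 1;
    }
    printf("level %d: %zu -> %zu bytes (levels go up to %d)\n",
           level, srcSize, n, ZSTD_maxCLevel());
    free(dst);
    return 0;
}
```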
There is an issue tracking this, with a bunch of links to discussions about it, but it seems they still haven't found the time for it.
https://github.com/facebook/zstd/issues/3100
This was the first place my mind went when I saw this Content-Encoding announcement, so I went and re-checked the issue :(.
Yes, but they also work for megacorp Facebook, and according to https://github.com/facebook/zstd/graphs/contributors, 300+ other contributors have made 4500+ commits to the zstd repo.
It's not quite as small-scale as 90's-style shareware was.
-
Show HN: macOS-cross-compiler – Compile binaries for macOS on Linux
-
How in the world should we unpack archive.org zst files on Windows?
If you want this functionality in zstd itself, check this out: https://github.com/facebook/zstd/pull/2349
-
ZSTD 1.5.5 is released with a corruption fix found at Google
-
Float Compression 3: Filters
Interesting to compare with observations from practical experience using ClickHouse[1][2] for time series:
1. Reordering to SOA helps a lot - this is the whole point of column-oriented databases.
2. Specialized codecs like Gorilla[3], DoubleDelta[4], and FPC[5] lose to simply using ZSTD[6] compression in most cases, both in compression ratio and in performance.
3. Specialized time-series DBMS like InfluxDB or TimescaleDB lose to general-purpose relational OLAP DBMS like ClickHouse [7][8][9].
[1] https://clickhouse.com/blog/optimize-clickhouse-codecs-compr...
[2] https://github.com/ClickHouse/ClickHouse
[3] https://clickhouse.com/docs/en/sql-reference/statements/crea...
[4] https://clickhouse.com/docs/en/sql-reference/statements/crea...
[5] https://clickhouse.com/docs/en/sql-reference/statements/crea...
[6] https://github.com/facebook/zstd/
[7] https://arxiv.org/pdf/2204.09795.pdf "SciTS: A Benchmark for Time-Series Databases in Scientific Experiments and Industrial Internet of Things" (2022)
[8] https://gitlab.com/gitlab-org/incubation-engineering/apm/apm... https://gitlab.com/gitlab-org/incubation-engineering/apm/apm...
[9] https://www.sciencedirect.com/science/article/pii/S187705091...
-
We're wasting money by only supporting gzip for raw DNA files
zstd has a long range mode, which lets it find redundancies a gigabyte away. Try --long and --long=31 for very long range mode.
zstd has delta / patch mode, which creates a file that stores the "patch" to create a new file from an old (reference) file. See https://github.com/facebook/zstd/wiki/Zstandard-as-a-patchin...
See the man page: https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md
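For anyone who wants those two modes from C rather than the CLI, below is a hedged sketch using libzstd's advanced API: ZSTD_CCtx_refPrefix plus long-distance matching is the library-level analogue of --patch-from combined with --long. The windowLog of 27 (a 128 MiB window), the fixed buffer sizes, and the minimal error handling are illustrative choices, not the only valid ones.

```c
/*
 * Sketch of zstd's long-range and patch modes via libzstd's advanced
 * API (analogue of the --long / --patch-from CLI flags). Link with
 * -lzstd; windowLog and buffer plumbing are illustrative.
 */
#include <stdio.h>
#include <zstd.h>

/* Compress newBuf against oldBuf as a reference, producing a delta. */
static size_t make_patch(const void *oldBuf, size_t oldSize,
                         const void *newBuf, size_t newSize,
                         void *dst, size_t dstCap)
{
    ZSTD_CCtx *cctx = ZSTD_createCCtx();

    /* Long-distance matching finds redundancies far back in the window;
     * windowLog bounds how far (--long=31 corresponds to windowLog 31;
     * 27 here means 128 MiB). */
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_enableLongDistanceMatching, 1);
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_windowLog, 27);

    /* Treat the old file as a prefix dictionary: matches into it turn
     * the output into a compact patch rather than a full compression. */
    ZSTD_CCtx_refPrefix(cctx, oldBuf, oldSize);

    size_t n = ZSTD_compress2(cctx, dst, dstCap, newBuf, newSize);
    if (ZSTD_isError(n))
        fprintf(stderr, "compress: %s\n", ZSTD_getErrorName(n));
    ZSTD_freeCCtx(cctx);
    return n;
}

/* Rebuild the new file: decompress the patch with the same old file
 * referenced as prefix. */
static size_t apply_patch(const void *oldBuf, size_t oldSize,
                          const void *patch, size_t patchSize,
                          void *dst, size_t dstCap)
{
    ZSTD_DCtx *dctx = ZSTD_createDCtx();

    /* Raise the decoder's window limit to match the encoder's. */
    ZSTD_DCtx_setParameter(dctx, ZSTD_d_windowLogMax, 27);
    ZSTD_DCtx_refPrefix(dctx, oldBuf, oldSize);

    size_t n = ZSTD_decompressDCtx(dctx, dst, dstCap, patch, patchSize);
    if (ZSTD_isError(n))
        fprintf(stderr, "decompress: %s\n", ZSTD_getErrorName(n));
    ZSTD_freeDCtx(dctx);
    return n;
}

int main(void)
{
    const char oldv[] = "The quick brown fox jumps over the lazy dog. v1";
    const char newv[] = "The quick brown fox jumps over the lazy dog. v2";
    char patch[256], rebuilt[256];

    size_t psz = make_patch(oldv, sizeof oldv, newv, sizeof newv,
                            patch, sizeof patch);
    size_t rsz = apply_patch(oldv, sizeof oldv, patch, psz,
                             rebuilt, sizeof rebuilt);
    printf("patch %zu bytes -> rebuilt %zu bytes\n", psz, rsz);
    return 0;
}
```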
What are some alternatives?
LZ4 - Extremely Fast Compression algorithm
Snappy - A fast compressor/decompressor
LZMA - (Unofficial) Git mirror of LZMA SDK releases
7-Zip-zstd - 7-Zip with support for Brotli, Fast-LZMA2, Lizard, LZ4, LZ5 and Zstandard
ZLib - A massively spiffy yet delicately unobtrusive compression library.
brotli - Brotli compression format
haproxy - HAProxy Load Balancer's development branch (mirror of git.haproxy.org)
LZFSE - LZFSE compression library and command line tool
zlib-ng - zlib replacement with optimizations for "next generation" systems.
zlib - Cloudflare fork of zlib with massive performance improvements
zfs - OpenZFS on Linux and FreeBSD
LZHAM - Lossless data compression codec with LZMA-like ratios but 1.5x-8x faster decompression speed, C/C++