soci-snapshotter vs zstd

soci-snapshotter

A containerd snapshotter plugin which enables standard OCI images to be lazily loaded without requiring a build-time conversion step. (by awslabs)

zstd

Zstandard - Fast real-time compression algorithm (by facebook)

Compression

zstd.net

Project          soci-snapshotter      zstd
Mentions         6                     109
Stars            470                   22,480
Growth           4.0%                  1.7%
Activity         9.2                   9.7
Latest Commit    2 days ago            4 days ago
Language         Go                    C
License          Apache License 2.0    GNU General Public License v3.0 or later

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

soci-snapshotter

Posts with mentions or reviews of soci-snapshotter. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-10.

Seekable OCI - Lazy Loading Container Images on ECS and Fargate
1 project | dev.to | 18 Oct 2023
Kubernetes SidecarContainers feature is merged
7 projects | news.ycombinator.com | 10 Jul 2023

So I can give some behind the scenes insight on that. I don't think image caching will be a thing in the way people are explicitly asking, but we are exploring some alternative approaches to speeding up container launch that we think will actually be even more effective than what people are asking for.
First of all, we want to leverage some of the learnings from AWS Lambda, specifically some of the research we've done that shows that about 75% of container images only contain 5% unique bytes (https://brooker.co.za/blog/2023/05/23/snapshot-loading.html). This makes deduplication incredibly effective, and allows the deployment of a smart cache that holds the 95% of popular recurring files and file chunks from container images, while letting the unique 5% be loaded over the network. There will be outliers of course, but if you base your image off a well-used base image then it will already be in the cache. This is partially implemented. You will notice that if you use certain base images your Fargate container seems to start a bit faster. (Unfortunately we do not really publish this list or commit to what base images are in the cache at this time).
In another step along this path we are working on SOCI Snapshotter (https://github.com/awslabs/soci-snapshotter) forked off of Stargz Snapshotter. This allows a container image to have an attached index file that actually allows it to start up before all the contents are downloaded, and lazy load in remaining chunks of the image as needed. This takes advantage of another aspect of container images which is that many of them don't actually use all of the bytes in the image anyway.
Over time we want to make these two pieces (deduplication and lazy loading) completely behind the scenes so you just upload your image to Elastic Container Registry and AWS Fargate seems to magically start your image dramatically faster than you could locally if downloading the image from scratch.
What is a containerd snapshotter?
2 projects | dev.to | 14 Feb 2023

This behavior allows us to set up snapshots out of band, that is, outside the "normal" workflow. One such example is the soci snapshotter, which allows image lazy loading. The snapshotter ships with an "rpull" command which performs this out-of-band prep. During rpull, the command calls the soci snapshotter, which creates fuse mounts for each layer that has an index (remote snapshot). For layers that do not have an index, it downloads them as usual (local snapshot). The local snapshot is created with an overlay mount. This is just a detail; the important bit is that folders created for a local snapshot only contain that layer, which is exactly what overlay does. For example, after rpull we'll see something like the following:
A Hidden Gem: Two Ways to Improve AWS Fargate Container Launch Times
3 projects | dev.to | 27 Oct 2022

Seekable OCI (SOCI) is a technology open-sourced by AWS that enables containers to launch faster by lazily loading the container image. It’s usually not possible to fetch individual files from gzipped tar files. With SOCI, AWS borrowed some of the design principles from stargz-snapshotter, but took a different approach. A SOCI index is generated separately from the container image and is stored in the registry as an OCI Artifact and linked back to the container image by OCI Reference Types. This means that the container images do not need to be converted, image digests do not change, and image signatures remain valid.
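
To make the "separate index, unchanged image" idea concrete, here is a minimal sketch in Python. It is not SOCI's actual index format: it works on an uncompressed tar held in memory, whereas a real SOCI index also has to make a gzip stream seekable. The function names `build_index` and `read_file` are hypothetical, and the byte slice stands in for a ranged fetch from a registry.

```python
import io
import tarfile

def build_index(layer_bytes):
    """Map each regular file in an uncompressed tar to (data offset, size).

    The tar itself is not modified; the index lives alongside it.
    """
    index = {}
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:") as tf:
        for member in tf:
            if member.isfile():
                index[member.name] = (member.offset_data, member.size)
    return index

def read_file(layer_bytes, index, name):
    """Fetch one file using only the index and a byte range of the layer."""
    offset, size = index[name]
    # In a lazy-loading setup this slice would be a ranged network read.
    return layer_bytes[offset:offset + size]
```

Because the index is built from the layer rather than baked into it, the layer bytes (and therefore its digest) stay the same, which is the property the excerpt above highlights.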
GitHub - awslabs/soci-snapshotter
1 project | /r/devopsish | 9 Sep 2022
awslabs/soci-snapshotter
1 project | /r/devopsish | 9 Sep 2022

zstd

Posts with mentions or reviews of zstd. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-05-07.

Rethinking string encoding: a 37.5% space efficient encoding than UTF-8 in Fury
2 projects | news.ycombinator.com | 7 May 2024

> In such cases, the serialized binary are mostly in 200~1000 bytes. Not big enough for zstd to work
You're not referring to the same dictionary that I am. Look at --train in [1].
If you have a training corpus of representative data, you can generate a dictionary that you preshare on both sides which will perform much better for very small binaries (including 200-1k bytes).
If you want maximum flexibility (i.e. you don't know the universe of representative messages ahead of time or you want maximum compression performance), you can gather this corpus transparently as messages are generated & then generate a dictionary & attach it as sideband metadata to a message. You'll probably need to defer the decoding if it references a dictionary not yet received (i.e. send delivers messages out-of-order from generation). There are other techniques you can apply, but the general rule is that your custom encoding scheme is unlikely to outperform zstd + a representative training corpus. If it does, you'd need to actually show this rather than try to argue from first principles.
[1] https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md
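
The pre-shared dictionary idea from the comment above can be sketched with Python's standard library. Note the substitution: this uses zlib's preset dictionary (`zdict`), not zstd and not a dictionary produced by `zstd --train`. The sample dictionary and message are made up for illustration; a trained zstd dictionary would normally be built from a corpus of representative messages.

```python
import zlib
from typing import Optional

# Hypothetical shared dictionary: a sample shaped like the messages we expect.
# Both sender and receiver must hold exactly the same bytes.
SHARED_DICT = b'{"user_id": 0, "event": "page_view", "path": "/", "status": "ok"}'

def compress(message: bytes, zdict: Optional[bytes] = None) -> bytes:
    """Compress one small message, optionally priming with a shared dictionary."""
    if zdict is None:
        c = zlib.compressobj(level=9)
    else:
        c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(message) + c.flush()

def decompress(blob: bytes, zdict: Optional[bytes] = None) -> bytes:
    """Decompress a message; the same dictionary must be supplied if one was used."""
    d = zlib.decompressobj() if zdict is None else zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()

message = b'{"user_id": 42, "event": "page_view", "path": "/pricing", "status": "ok"}'
plain = compress(message)
primed = compress(message, SHARED_DICT)
print(len(message), len(plain), len(primed))
```

For messages this small there is little internal redundancy to exploit, so most of the gain comes from matches against the dictionary, which is the point the commenter is making about 200 to 1000 byte payloads.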
Drink Me: (Ab)Using a LLM to Compress Text
2 projects | news.ycombinator.com | 4 May 2024

> Doesn't take large amount of GPU resources
This is an understatement: zstd dictionary compression and decompression are blazingly fast: https://github.com/facebook/zstd/blob/dev/README.md#the-case...
My real-world use case for this was JSON files in a particular schema, and the results were fantastic.
SQLite VFS for ZSTD seekable format
2 projects | news.ycombinator.com | 26 Apr 2024

This VFS will read a sqlite file after it has been compressed using [zstd seekable format](https://github.com/facebook/zstd/blob/dev/contrib/seekable_f...). Built to support read-only databases for full-text search. Benchmarks are provided in README.
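
The general idea behind a seekable compressed format can be sketched as follows: compress the data as independent frames and keep a small seek table, so a read at an arbitrary offset only decompresses the frames it touches. This is an illustration using zlib from the Python standard library, not the zstd seekable format itself; the frame size and the names `build_frames` and `read_at` are assumptions made for the example.

```python
import zlib
from bisect import bisect_right

FRAME_SIZE = 4096  # hypothetical uncompressed bytes per frame

def build_frames(data, frame_size=FRAME_SIZE):
    """Compress data as independent frames plus a table of uncompressed offsets."""
    frames, offsets = [], []
    for start in range(0, len(data), frame_size):
        offsets.append(start)
        frames.append(zlib.compress(data[start:start + frame_size]))
    return frames, offsets

def read_at(frames, offsets, pos, length):
    """Return up to `length` bytes starting at uncompressed offset `pos`."""
    if not frames or pos < 0:
        return b""
    out = bytearray()
    i = bisect_right(offsets, pos) - 1  # frame that contains `pos`
    while length > 0 and i < len(frames):
        raw = zlib.decompress(frames[i])  # only this frame is decompressed
        start = pos - offsets[i]
        piece = raw[start:start + length]
        out += piece
        pos += len(piece)
        length -= len(piece)
        i += 1
    return bytes(out)
```

Smaller frames make random reads cheaper but usually cost some compression ratio, since each frame is compressed without knowledge of the others.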
Chrome Feature: ZSTD Content-Encoding
10 projects | news.ycombinator.com | 1 Apr 2024

Of course, you may get different results with another dataset.
gzip (zlib -6) [ratio=32%] [compr=35Mo/s] [dec=407Mo/s]
zstd (zstd -2) [ratio=32%] [compr=356Mo/s] [dec=1067Mo/s]
NB1: The default for zstd is -3, but the table only had -2. The difference is probably small. The range is 1-22 for zstd and 1-9 for gzip.
NB2: The default program for gzip (at least with Debian) is the executable from zlib. With my workflows, libdeflate-gzip is compatible and noticeably faster.
NB3: This benchmark is 2 years old. The latest releases of zstd are much better, see https://github.com/facebook/zstd/releases
For a high compression, according to this benchmark xz can do slightly better, if you're willing to pay a 10× penalty on decompression.
xz -9 [ratio=23%] [compr=2.6Mo/s] [dec=88Mo/s]
zstd -9 [ratio=23%] [compr=2.6Mo/s] [dec=88Mo/s]
Zstandard v1.5.6 – Chrome Edition
1 project | news.ycombinator.com | 26 Mar 2024
Optimizating Rabin-Karp Hashing
1 project | news.ycombinator.com | 9 Mar 2024

Compression, synchronization and backup systems often use rolling hash to implement "content-defined chunking", an effective form of deduplication.
In optimized implementations, Rabin-Karp is likely to be the bottleneck. See for instance https://github.com/facebook/zstd/pull/2483 which replaces a Rabin-Karp variant by a >2x faster Gear-Hashing.
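
As a rough illustration of content-defined chunking with a Gear-style rolling hash, here is a minimal Python sketch. It is not the code from the zstd pull request mentioned above; the table construction, the parameters (`avg_bits`, `min_size`, `max_size`) and the function name `split` are assumptions chosen for the example.

```python
import hashlib

# Gear table: 256 fixed pseudo-random 64-bit values, one per byte value.
# Derived from SHA-256 here only so the table is deterministic.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
        for i in range(256)]
MASK64 = (1 << 64) - 1

def split(data, avg_bits=12, min_size=256, max_size=65536):
    """Split data into chunks whose boundaries depend on content, not position."""
    mask = (1 << avg_bits) - 1
    chunks = []
    h = 0
    start = 0
    for i, byte in enumerate(data):
        # Gear update: one shift and one add per byte; old bytes age out
        # of the hash as they are shifted past the bits we test.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because a boundary is decided by the last few bytes seen rather than by absolute position, inserting bytes near the start of a file typically changes only the first chunk, and later chunks still deduplicate against the original.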
Show HN: macOS-cross-compiler – Compile binaries for macOS on Linux
7 projects | news.ycombinator.com | 17 Feb 2024
Cyberpunk 2077 dev release
1 project | /r/gamedev | 11 Dec 2023

Get the data:
https://publicdistst.blob.core.windows.net/data/root.tar.zst
magnet:?xt=urn:btih:84931cd80409ba6331f2fcfbe64ba64d4381aec5&dn=root.tar.zst
How to extract: https://github.com/facebook/zstd
Linux (debian): `sudo apt install zstd`
```
tar -I 'zstd -d -T0' -xvf root.tar.zst
```
Honey, I shrunk the NPM package · Jamie Magee
1 project | news.ycombinator.com | 3 Oct 2023

I've done that experiment with zstd before.
https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md...
Not sure about brotli though.
How in the world should we unpack archive.org zst files on Windows?
2 projects | /r/Archiveteam | 24 May 2023

If you want this functionality in zstd itself, check this out: https://github.com/facebook/zstd/pull/2349

What are some alternatives?

When comparing soci-snapshotter and zstd you can also consider the following projects:

stargz-snapshotter - Fast container image distribution plugin with lazy pulling

LZ4 - Extremely Fast Compression algorithm

cloudsql-proxy - A utility for connecting securely to your Cloud SQL instances [Moved to: https://github.com/GoogleCloudPlatform/cloud-sql-proxy]

Snappy - A fast compressor/decompressor

enhancements - Enhancements tracking repo for Kubernetes

LZMA - (Unofficial) Git mirror of LZMA SDK releases

containers-roadmap - This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).

7-Zip-zstd - 7-Zip with support for Brotli, Fast-LZMA2, Lizard, LZ4, LZ5 and Zstandard

kubernetes - Production-Grade Container Scheduling and Management

ZLib - A massively spiffy yet delicately unobtrusive compression library.

brotli - Brotli compression format

haproxy - HAProxy Load Balancer's development branch (mirror of git.haproxy.org)
