minio vs zstd

minio

The Object Store for AI Data Infrastructure (by minio)

Source Code

min.io

Suggest alternative

Edit details

zstd

Zstandard - Fast real-time compression algorithm (by facebook)

Compression

Source Code

zstd.net

Suggest alternative

Edit details

InfluxDB - Power Real-Time Data Analytics at Scale

Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

www.influxdata.com

featured

SaaSHub - Software Alternatives and Reviews

SaaSHub helps you find the best software and product alternatives

www.saashub.com

featured

minio		zstd
	Project
100	Mentions	109
44,434	Stars	22,480
1.6%	Growth	1.7%
9.9	Activity	9.7
1 day ago	Latest Commit	1 day ago
Go	Language	C
GNU Affero General Public License v3.0	License	GNU General Public License v3.0 or later

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

minio

Posts with mentions or reviews of minio. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-05-05.

JuiceFS 1.2 Beta 1: Gateway Upgrade, Enhanced Multi-User Permission Management
2 projects | dev.to | 5 May 2024

The core service of JuiceFS Gateway is to expose the POSIX file system via the S3 API. Before v1.2, we integrated the MinIO Gateway module into our code. This module was based on the Apache license. This integration made JuiceFS operations similar to MinIO's native command minio gateway NAS. For users, JuiceFS could be seen as an implementation of MinIO Gateway. MinIO's other backend implementations include NAS and Hadoop.
A Distributed File System in Go Cut Average Metadata Memory Usage to 100 Bytes
2 projects | news.ycombinator.com | 22 Feb 2024

Looks like minio added this in 2022:
https://github.com/minio/minio/pull/15433
Simulate multi-nodes configuration
1 project | /r/minio | 6 Dec 2023

We have this example of docker compose you can adapt to be larger https://github.com/minio/minio/blob/master/docs/orchestration/docker-compose/docker-compose.yaml
Ask HN: I have 10 yrs of Exp. Failed 4 takehome projects. What am I doing wrong?
7 projects | news.ycombinator.com | 11 Jul 2023

>Again, here you seem to be arguing against a strawman that doesn't know that blocking the IO loop is bad. Try arguing against one that knows ways to work around that. This is why I'm saying this rule isn't true. Extensive computation on single-threaded "scripting" languages is possible (and even if it wasn't, punt it off to a remote pool of workers, which could also be NodeJS!).
Very rare to find a rule that's absolutely true.. I clearly stated exceptions to the rule (which you repeated) but the generality is still true.
Threading in nodejs is new and didn't exist since the last time I touched it. It looks like it's not the standard use case as google searches still have websites with titles saying node is single threaded everywhere. The only way I can see this being done is multiple Processes (meaning each with a copy of v8) using OS shared memory as IPC and they're just calling it threads. It will take a shit load of work to make v8 actually multi-threaded.
Processes are expensive so you can't really follow this model per request. And we stopped following threading per request over a decade ago.
Again these are exceptions to the rule, from what I'm reading Nodejs is normally still single threaded with a fixed number of worker processes that are called "threads". Under this my general rule is still generally true: backend engineering does no typically involve writing non blocking code and offloading compute to other sources. Again, there are exceptions but as I stated before these exceptions are rare.
>Here's what I mean -- you can actually solve the ordering problem in O(N) + O(M) time by keeping track of the max you've seen and building a sparse array and running through every single index from max to zero. It's overkill, but it's generally referred to as a counting sort:
Oh come on. We both know these sorts won't work. These large numbers will throw off memory. Imagine 3 routes. One route gets 352 hits, another route gets 400 hits, and another route gets 600,000 hits. What's Big Oh for memory and sort?
It's O(600,000) for both memory and runtime. N=3 and it doesn't even matter here. Yeah these types of sorts are almost never used for this reason, they only work for things with smaller ranges. It's also especially not useful for this project. Like this project was designed so "counting sort" fails big time.
Also we don't need to talk about the O(N) read and write. That's a given it's always there.
>I don't think these statements make sense -- having docker installed and having redis installed are basically equivalent work. At the end of the day, the outcome is the same -- the developer is capable of running redis locally. Having redis installed on your local machine is absolutely within range for a backend developer.
Unfortunately these statements do make sense and your characterization seems completely dishonest to me. People like to keep their local environments pure and segregated away from daemons that run in a web server. I'm sure in your universe you are claiming web developers install redis, postgresql and kafka all locally but that just sounds absurd to me. We can agree to disagree but from my perspective I don't think you're being realistic here.
>Also, remote development is not practiced by many companies -- the only companies I've seen doing thin-clients that are large.
It's practiced by a large amount and basically every company I've worked at for the past 5 years. Every company has to at least partially do remote dev in order to fully test E2E stuff or integrations.
>I see it as just spinning up docker, not compose -- you already have access to the app (ex. if it was buildable via a function) so you could spawn redis in a subprocess (or container) on a random port, and then spawn the app.
Sure. The point is it's hacky to do this without an existing framework. I'll check out that library you linked.
>I agree that integration testing is harder -- I think there's more value there.
Of course there's more value. You get more value at higher cost. That's been my entire point.
>Also, for replicating S3, minio (https://github.com/minio/minio) is a good stand-in. For replicating lambda, localstack (https://docs.localstack.cloud/user-guide/aws/lambda/) is probably reasonable there's also frameworks with some consideration for this (https://www.serverless.com/framework/docs/providers/aws/guid...) built in.
Good finds. But what about SNS, IOT, Big Query and Redshift? Again my problem isn't about specific services, it's about infra in general.
>Ah, this is true -- but I think this is what people are testing in interviews. There is a predominant culture/shared values, and the test is literally whether someone can fit into those values.
No. I think what's going on is people aren't putting much thought into what they're actually interviewing for. They just have some made up bar in their mind whether it's a leetcode algorithm or whether the guy wrote a unit test for the one available pure function for testing.
>Whether they should or should not be, that's at least partially what interviews are -- does the new team member feel the same way about technical culture currently shared by the team.
The answer is no. There's always developers who disagree with things and just don't reveal it. Think about the places you worked at. Were you in total agreement? I doubt it. A huge amount of devs are opinionated and think company policies or practices are BS. People adapt.
>Now in the case of this interview your solution was just fine, even excellent (because you went out of your way to do async io, use newer/easier packaging methodologies, etc), but it's clearly not just that.
The testing is just a game. I can play the game and suddenly I pass all the interviews. I think this is the flaw with your methodology as I just need to write tests to get in. Google for example in spirit attempted another method which involves testing IQ via algorithms. It's a much higher bar
The problem with google is that their methodology can also be gamed but it's much harder to game it and often the bar is too high for the actual job the engineer is expected to do.
I think both methodologies are flawed, but hiring via ignoring raw ability and picking people based off of weirdly specific cultural preferences is the worse of the two hiring methodologies.
Put it this way. If a company has a strong testing culture, then engineers who don't typically test things will adapt. It's not hard to do, and testing isn't so annoying that they won't do it.
Unable to configure a MinIO cluster, pls help
1 project | /r/selfhosted | 30 Jun 2023

The answer is here https://github.com/minio/minio/discussions/17543

1 project | /r/minio | 30 Jun 2023

You've already helped me here https://github.com/minio/minio/discussions/17543. Thank you very much once more.
What's the best AWS S3 protocol alternative?
3 projects | news.ycombinator.com | 31 May 2023

You say protocol alternative, but assuming you're more concerned with AWS as the host than S3 as the protocol you might try https://github.com/minio/minio
If you do feel an aversion to the protocol then the rclone backend list would be a good starting point
https://rclone.org/overview/
proper content delivery (images etc)
1 project | /r/sysadmin | 25 May 2023

Seems like you want object storage. S3 would be the goto suggestion here, but you said it needs to run on prem so perhaps MinIO.
Reason to use other Build Tool than Make?
9 projects | /r/golang | 19 May 2023

You could refer to big OSS project Makefiles to take a look, what could be there, for example: https://github.com/minio/minio/blob/master/Makefile
Looking for a Backblaze B2 compatible cloud backup application for Linux that uses standard file level (not block level) ZIP encryption (and with GUI would be nice).
3 projects | /r/DataHoarder | 16 May 2023

Backblaze's B2 is compatible with AWS S3 that also implemented in selfhosted minio

zstd

Posts with mentions or reviews of zstd. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-05-07.

Rethinking string encoding: a 37.5% space efficient encoding than UTF-8 in Fury
2 projects | news.ycombinator.com | 7 May 2024

> In such cases, the serialized binary are mostly in 200~1000 bytes. Not big enough for zstd to work
You're not referring to the same dictionary that I am. Look at --train in [1].
If you have a training corpus of representative data, you can generate a dictionary that you preshare on both sides which will perform much better for very small binaries (including 200-1k bytes).
If you want maximum flexibility (i.e. you don't know the universe of representative messages ahead of time or you want maximum compression performance), you can gather this corpus transparently as messages are generated & then generate a dictionary & attach it as sideband metadata to a message. You'll probably need to defer the decoding if it references a dictionary not yet received (i.e. send delivers messages out-of-order from generation). There are other techniques you can apply, but the general rule is that your custom encoding scheme is unlikely to outperform zstd + a representative training corpus. If it does, you'd need to actually show this rather than try to argue from first principles.
[1] https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md
Drink Me: (Ab)Using a LLM to Compress Text
2 projects | news.ycombinator.com | 4 May 2024

> Doesn't take large amount of GPU resources
This is an understatement, zstd dictionary compression and decompression are blazingly fast: https://github.com/facebook/zstd/blob/dev/README.md#the-case...
My real-world use case for this was JSON files in a particular schema, and the results were fantastic.
SQLite VFS for ZSTD seekable format
2 projects | news.ycombinator.com | 26 Apr 2024

This VFS will read a sqlite file after it has been compressed using [zstd seekable format](https://github.com/facebook/zstd/blob/dev/contrib/seekable_f...). Built to support read-only databases for full-text search. Benchmarks are provided in README.
Chrome Feature: ZSTD Content-Encoding
10 projects | news.ycombinator.com | 1 Apr 2024

Of course, you may get different results with another dataset.
gzip (zlib -6) [ratio=32%] [compr=35Mo/s] [dec=407Mo/s]
zstd (zstd -2) [ratio=32%] [compr=356Mo/s] [dec=1067Mo/s]
NB1: The default for zstd is -3, but the table only had -2. The difference is probably small. The range is 1-22 for zstd and 1-9 for gzip.
NB2: The default program for gzip (at least with Debian) is the executable from zlib. With my workflows, libdeflate-gzip iscompatible and noticably faster.
NB3: This benchmark is 2 years old. The latest releases of zstd are much better, see https://github.com/facebook/zstd/releases
For a high compression, according to this benchmark xz can do slightly better, if you're willing to pay a 10× penalty on decompression.
xz -9 [ratio=23%] [compr=2.6Mo/s] [dec=88Mo/s]
zstd -9 [ratio=23%] [compr=2.6Mo/s] [dec=88Mo/s]
Zstandard v1.5.6 – Chrome Edition
1 project | news.ycombinator.com | 26 Mar 2024
Optimizating Rabin-Karp Hashing
1 project | news.ycombinator.com | 9 Mar 2024

Compression, synchronization and backup systems often use rolling hash to implement "content-defined chunking", an effective form of deduplication.
In optimized implementations, Rabin-Karp is likely to be the bottleneck. See for instance https://github.com/facebook/zstd/pull/2483 which replaces a Rabin-Karp variant by a >2x faster Gear-Hashing.
Show HN: macOS-cross-compiler – Compile binaries for macOS on Linux
7 projects | news.ycombinator.com | 17 Feb 2024
Cyberpunk 2077 dev release
1 project | /r/gamedev | 11 Dec 2023

Get the data https://publicdistst.blob.core.windows.net/data/root.tar.zst magnet:?xt=urn:btih:84931cd80409ba6331f2fcfbe64ba64d4381aec5&dn=root.tar.zst How to extract https://github.com/facebook/zstd Linux (debian): `sudo apt install zstd` ``` tar -I 'zstd -d -T0' -xvf root.tar.zst ```
Honey, I shrunk the NPM package · Jamie Magee
1 project | news.ycombinator.com | 3 Oct 2023

I've done that experiment with zstd before.
https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md...
Not sure about brotli though.
How in the world should we unpack archive.org zst files on Windows?
2 projects | /r/Archiveteam | 24 May 2023

If you want this functionality in zstd itself, check this out: https://github.com/facebook/zstd/pull/2349

What are some alternatives?

When comparing minio and zstd you can also consider the following projects:

Nextcloud - ☁️ Nextcloud server, a safe home for all your data

LZ4 - Extremely Fast Compression algorithm

Seaweed File System - SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding. [Moved to: https://github.com/seaweedfs/seaweedfs]

Snappy - A fast compressor/decompressor

GlusterFS - Gluster Filesystem : Build your distributed storage in minutes

LZMA - (Unofficial) Git mirror of LZMA SDK releases

Samba - https://gitlab.com/samba-team/samba is the Official GitLab mirror of https://git.samba.org/samba.git -- Merge requests should be made on GitLab (not on GitHub)

7-Zip-zstd - 7-Zip with support for Brotli, Fast-LZMA2, Lizard, LZ4, LZ5 and Zstandard

seaweedfs - SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.

ZLib - A massively spiffy yet delicately unobtrusive compression library.

Swift - OpenStack Storage (Swift). Mirror of code maintained at opendev.org.

brotli - Brotli compression format

minio vs Nextcloud zstd vs LZ4 minio vs Seaweed File System zstd vs Snappy minio vs GlusterFS zstd vs LZMA minio vs Samba zstd vs 7-Zip-zstd minio vs seaweedfs zstd vs ZLib minio vs Swift zstd vs brotli

Compare minio vs zstd and see what are their differences.

minio

zstd

minio

zstd

What are some alternatives?