s3-stream-unzip vs goofys
| | s3-stream-unzip | goofys |
|---|---|---|
| Mentions | 8 | 16 |
| Stars | 36 | 5,043 |
| Growth | - | - |
| Activity | 10.0 | 0.0 |
| Last commit | over 1 year ago | 3 months ago |
| Language | Java | Go |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
s3-stream-unzip
- Show HN: Unzip files in S3 with Java, without memory or disk storage
- Stream unzip files in S3 with Java
  All that is left to do now is to integrate stream download, unzip, and multipart upload. I've done all the hard work and built nejckorasa/s3-stream-unzip.
- Java S3 stream unzip - unzip files in S3 without memory or disk storage
- s3-stream-unzip
  s3-stream-unzip is a Java utility that manages unzipping of data in AWS S3 utilising stream download and multipart upload.
- S3 stream unzip - unzip files in S3 without memory or disk storage
- Java S3 stream unzip - unzip files in S3
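The mentions above describe combining stream download, unzip, and multipart upload. A minimal sketch of that flow, using only the JDK, might look like the following. It is not the code of s3-stream-unzip: `StreamUnzipSketch`, `PartSink`, and `uploadPart` are hypothetical names introduced here, and the sink merely stands in for a real multipart upload call. The input stream stands in for a streamed S3 download.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class StreamUnzipSketch {

    /** Hypothetical stand-in for a multipart upload target (assumption, not a real S3 API). */
    interface PartSink {
        void uploadPart(String key, int partNumber, byte[] data);
    }

    /**
     * Reads a zip archive from a stream and forwards each file entry to the sink
     * in parts of at most partSize bytes, so no entry is ever held in memory whole.
     * Empty entries produce no parts in this sketch.
     */
    static void unzip(InputStream zipStream, int partSize, PartSink sink) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(zipStream)) {
            byte[] buffer = new byte[partSize];
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                int partNumber = 1;
                int filled = 0;
                int n;
                // read() returns -1 once the current entry is exhausted
                while ((n = zis.read(buffer, filled, partSize - filled)) != -1) {
                    filled += n;
                    if (filled == partSize) {
                        // Buffer is full: hand a copy to the sink and start the next part
                        sink.uploadPart(entry.getName(), partNumber++, buffer.clone());
                        filled = 0;
                    }
                }
                if (filled > 0) {
                    // Final, possibly shorter part of this entry
                    sink.uploadPart(entry.getName(), partNumber, Arrays.copyOf(buffer, filled));
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small zip in memory to play the role of the downloaded object
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry("a.txt"));
            zos.write("hello world".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        unzip(new ByteArrayInputStream(bytes.toByteArray()), 4,
                (key, part, data) ->
                        System.out.println(key + " part " + part + ": " + data.length + " bytes"));
    }
}
```

The part size of 4 bytes is only for demonstration. S3 multipart uploads require every part except the last to be at least 5 MiB, so a real sink would use a much larger buffer.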
goofys
- Is Posix Outdated?
  The author needs to ask themselves: in this cloud technology stack, is there POSIX involved somewhere lower down, where I can't access it? The answer is, of course, "yes". The sort of cloud storage systems described all run on top of POSIX APIs. They provide convenience (cost efficiency is more debatable) compared to the POSIX alternative, but that's because they exist at an entirely different conceptual layer (hence the presence of POSIX anyway, just buried).
  Your point about surfacing a POSIX that's actually there but hidden (visible to the low-level Amazon employees building the S3 service, invisible to S3 end customers) is true but isn't the point of the article. The author is saying there are motivations for a POSIX-like API that is also visible to the end user.
  So your explanation of the stack looks like 2 layers: POSIX API <-- AWS S3 built on top of that
  The author's essay is actually talking about 3 layers: POSIX <-- AWS S3 <-- POSIX
  That's why the blog post has the following links to POSIX-on-top-of-S3-objects:
  https://github.com/s3fs-fuse/s3fs-fuse
  https://github.com/kahing/goofys
  https://www.cuno.io/
- AWS Announces Open Source Mountpoint for Amazon S3
  How is this different than these other solutions?
  https://github.com/kahing/goofys
  https://github.com/s3fs-fuse/s3fs-fuse
- Introducing Mountpoint for Amazon S3 - A file client that translates local file system API calls to S3 object API calls like GET and LIST.
  But now I ask... why not s3fs? Is it the GPL licensing? Or even goofys, which also has Apache 2 licensing and seems to hit similar goals (not fully POSIX compliant)? Why build your own?
- Merge my S3 with Mac Finder Folder
- Migrating instance to AWS GovCloud
  If your 20TB is in S3, use a staging box with goofys (https://github.com/kahing/goofys) to mount the commercial S3 bucket(s) into a folder, then use s3 sync to copy to your bucket(s) in GovCloud.
- How should I go about creating a program that holds various MP4 files?
- Raft Consensus Animated
- How do you manage large training datasets?
  So, we just need to change the dataloader function a bit to make this work then. Did you try just mounting S3 using https://github.com/kahing/goofys? In that case, we need not even change the dataloader code. Not sure of the performance though.
- Mount S3 Objects to Kubernetes Pods
  We're using goofys as the mounting utility. It's a "high-performance, POSIX-ish Amazon S3 file system written in Go" based on FUSE (file system in user space) technology.
- What you gonna add to your selfhost stack this year?
  Will probably experiment with https://github.com/kahing/goofys and https://litestream.io/ to make services easier to move between devices :) Also, will continue working on https://synpse.net/ to make the operations easier.
What are some alternatives?
fluency - High throughput data ingestion logger to Fluentd, AWS S3 and Treasure Data
s3fs-fuse - FUSE-based file system backed by Amazon S3
DEFLATE-library-Java - Efficient DEFLATE compressor and decompressor in pure Java.
rclone - "rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
s3proxy - Access other storage backends via the S3 API
gcsfuse - A user-space file system for interacting with Google Cloud Storage
juicefs - JuiceFS is a distributed POSIX file system built on top of Redis and S3.
catfs - Cache AnyThing filesystem written in Rust
s3fs - S3 Filesystem
s3-proxy - S3 Reverse Proxy with GET, PUT and DELETE methods and authentication (OpenID Connect and Basic Auth)
s3fs - S3 FileSystem (fs.FS) implementation
goseaweedfs - A complete Golang client for SeaweedFS