manta
juicefs
Metric | manta | juicefs
---|---|---
Mentions | 5 | 42
Stars | 603 | 9,824
Growth | 0.7% | 2.8%
Activity | 3.5 | 9.8
Last commit | 25 days ago | about 15 hours ago
Language | Makefile | Go
License | Mozilla Public License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
manta
-
Command-line Tools can be 235x Faster than your Hadoop Cluster (2014)
These posts always remind me of the [Manta Object Storage](https://www.tritondatacenter.com/triton/object-storage) project by Joyent. This project was basically object storage combined with the ability to run arbitrary programs against your data in situ. The key difference is that you kept the data in place and distributed the program to the data storage nodes (the opposite of most data processing, as I understand it). I think of this as a superpowered version of using [pssh](https://linux.die.net/man/1/pssh) to grep logs across a datacenter. Yet another idea before its time. Luckily, Joyent [open sourced](https://github.com/TritonDataCenter/manta) the work, but the fact that it still hasn't caught on as "The Way" is telling.
Some of the projects I remember from the Joyent team were: dumping recordings of local Mario Kart games to Manta and running analytics on the raw video to generate office kart racer stats; the bog-standard "dump all the logs and map/reduce/grep/count them" job; and, I think, one about running mdb postmortems on terabytes of core dumps.
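The "send the program to the data" idea described above can be sketched in a few lines of Python. This is only a toy simulation under stated assumptions: `NODES`, `count_matches_on_node`, and `run_job` are hypothetical names made up for illustration, and nothing here reflects Manta's actual job API. Each simulated node scans its own objects locally and returns a single integer, so only small partial results cross the "network".

```python
# Hypothetical storage nodes: each one holds its own objects locally.
NODES = {
    "node-a": {"/logs/web1.log": "GET /\nGET /kart\nPOST /login\n"},
    "node-b": {"/logs/web2.log": "GET /kart\nGET /kart\n"},
}


def count_matches_on_node(objects, pattern):
    """Map phase: runs where the data lives and returns one small number."""
    return sum(
        pattern in line
        for body in objects.values()
        for line in body.splitlines()
    )


def run_job(nodes, pattern):
    """Ship the same function to every node, then reduce the partial counts."""
    partials = [count_matches_on_node(objs, pattern) for objs in nodes.values()]
    return sum(partials)


if __name__ == "__main__":
    print(run_job(NODES, "/kart"))
```

The contrast with a conventional pipeline is that the log bodies never leave their node; only the per-node counts are collected and summed.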
-
An open-source distributed object storage service
Are you sure this offers the same S3-compatible API? It sure does look like it rolled its own API[1], which I guess is fine so long as you're entirely in the Triton ecosystem, but it makes reusing existing software harder than necessary without that compatibility layer. And that's not even getting into this absolute mess: https://github.com/TritonDataCenter/manta#repositories. It reminds me of the "Microservices" video come to life.
1: https://github.com/TritonDataCenter/manta/blob/master/docs/u...
-
Oxide at Home: Propolis Says Hello
This is great information, and we've had similar experiences.
I'm also looking forward to further testing LinuxCN (https://github.com/joyent/linux-live/tree/linuxcn) on Triton in the near future!
Are you running Manta (https://github.com/joyent/manta) for anything? If so, is that meeting your needs for object storage?
-
Command-line Tools can be 235x Faster than your Hadoop Cluster – Adam Drake
Joyent's Manta took this concept to the extreme.
Previous discussion here: https://news.ycombinator.com/item?id=5939340
Image manipulation example: https://www.joyent.com/blog/joyent-manta-storage-service-ima...
Manta repo on GitHub: https://github.com/joyent/manta
juicefs
-
South Korea's No.1 Search Engine Chose JuiceFS over Alluxio for AI Storage
Support for Kerberos keytab files
-
5 Open Source tools written in Golang that you should know about
JuiceFS, released under the Apache License 2.0, is a high-performance POSIX file system optimized for cloud-native environments. It stores data in object storage (e.g., Amazon S3) and metadata in databases such as Redis, MySQL, or TiKV. JuiceFS lets big data, machine learning, and AI applications use massive cloud storage efficiently, much as they would local storage. It features full POSIX and Hadoop compatibility, an S3 interface, Kubernetes support, and shared file storage for numerous clients. Some cool features are strong consistency, scalable performance, data encryption, global file locks, and compression with LZ4 or Zstandard.
-
How to Build a Ceph Cluster and Integrate with the JuiceFS File System
To handle capacity overruns better, the JuiceFS client supports delete operations even when the Ceph cluster is full (see related code changes in JuiceFS Community Edition). Therefore, with newer client versions, there is no need to use set-full-ratio for temporary adjustments.
-
A Deep Dive into the Design of Directory Quotas in JuiceFS
If you have any questions or would like to learn more, feel free to join discussions about JuiceFS on GitHub and the JuiceFS community on Slack.
- JuiceFS 1.1 - Distributed File System written in Go
-
Gcsfuse: A user-space file system for interacting with Google Cloud Storage
The architecture image shows GCS and others, so I suspect it does.
https://github.com/juicedata/juicefs#architecture
-
Google Cloud Storage FUSE
See also: JuiceFS: https://juicefs.com/
Adds a DBMS or key-value store for metadata, making the filesystem much faster (POSIX, small overwrites don't have to replace a full object in the GCS/S3 backend).
Almost certainly a better solution if you want to turn your object storage into a mountable filesystem, with the (big) caveat that you can't access the files directly in the bucket (they are not stored transparently).
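To make the trade-off in this comment concrete, here is a minimal toy model in Python. It is a sketch under loud assumptions, not JuiceFS's real on-disk layout: `ToyFS`, `CHUNK`, and the `chunks/N` key scheme are all hypothetical. A dict stands in for the bucket, another dict stands in for the metadata database, and a small overwrite uploads only the chunk it touches instead of the whole file.

```python
CHUNK = 4  # tiny chunk size for illustration only


class ToyFS:
    def __init__(self):
        self.objects = {}  # stands in for the S3/GCS bucket: opaque key -> bytes
        self.meta = {}     # stands in for the metadata DB: path -> list of keys
        self.puts = 0      # how many objects have been uploaded so far
        self._next = 0

    def _put(self, data):
        """Upload one chunk under a fresh opaque key and return that key."""
        key = f"chunks/{self._next}"
        self._next += 1
        self.objects[key] = data
        self.puts += 1
        return key

    def write(self, path, data):
        """Split the file into chunks and record their keys in metadata."""
        self.meta[path] = [
            self._put(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)
        ]

    def overwrite(self, path, offset, data):
        """Overwrite bytes inside an existing file, re-uploading only the
        chunks that overlap the range (old chunks are left as garbage here)."""
        if not data:
            return
        keys = self.meta[path]
        end = offset + len(data)
        for idx in range(offset // CHUNK, (end - 1) // CHUNK + 1):
            start = idx * CHUNK
            buf = bytearray(self.objects[keys[idx]])
            lo = max(offset, start)
            hi = min(end, start + len(buf))
            buf[lo - start:hi - start] = data[lo - offset:hi - offset]
            keys[idx] = self._put(bytes(buf))

    def read(self, path):
        """Reassemble the file by following the metadata to each chunk."""
        return b"".join(self.objects[k] for k in self.meta[path])


if __name__ == "__main__":
    fs = ToyFS()
    fs.write("/a.txt", b"hello world!")
    fs.overwrite("/a.txt", 6, b"W")
    print(fs.read("/a.txt"), fs.puts, sorted(fs.objects))
```

The same model also shows the caveat: the bucket holds only opaque chunk keys, so a file such as `/a.txt` cannot be found there without going through the metadata store.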
- Using S3 as shared storage
-
s3fs-fuse VS juicefs - a user suggested alternative
2 projects | 19 Feb 2023
JuiceFS can do the same thing as s3fs-fuse, but better, because it supports robust data consistency and caching policies that improve performance.
- JuiceFS: Turn Cloud Blob Storage into Local Posix Filesystems
What are some alternatives?
oxide-and-friends - Show notes from Oxide and Friends recordings
cubefs - cloud-native file store
linux-live - Linux compute node platform image tools. This is the Linux counterpart to smartos-live.
goofys - a high-performance, POSIX-ish Amazon S3 file system written in Go
Canvas LMS - The open LMS by Instructure, Inc.
s3-benchmark - Measure Amazon S3's performance from any location.
Seaweed File System - SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding. [Moved to: https://github.com/seaweedfs/seaweedfs]
gcsfuse - A user-space file system for interacting with Google Cloud Storage
seaweedfs - SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.
Golang-PDF-to-Image-Converter - This project will help you to convert PDF file to IMAGE using golang.
riak_cs - Riak CS is simple, available cloud storage built on Riak.
hdfs - A native go client for HDFS