Top 13 HDFS Open-Source Projects
-
seaweedfs
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.
-
rumble
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more (by RumbleDB)
-
elbencho
A distributed storage benchmark for file systems, object stores & block devices with support for GPUs
Project mention: DwarFS – The Deduplicating Warp-Speed Advanced Read-Only File System | news.ycombinator.com | 2024-04-11
Whoops: WebDAV:
https://news.ycombinator.com/item?id=39417503
SeaweedFS supports WebDAV. https://github.com/seaweedfs/seaweedfs/wiki/WebDAV
I'm not able to find whether borg/restic support mounting backups as WebDAV, but in theory there's nothing stopping you.
It's 100% user space (it just exposes a REST service) and is supported by a bunch of file browsers, with a bit of a network-aware component to it as well.
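To make the "REST service" remark concrete, here is a minimal sketch of what a WebDAV directory listing looks like on the wire: an ordinary HTTP request that uses the `PROPFIND` method. The endpoint URL and the `build_propfind` helper are assumptions for illustration, not taken from the SeaweedFS documentation; the request is only built, not sent.

```python
import urllib.request

# Hypothetical endpoint: replace with the address of your own WebDAV server.
WEBDAV_URL = "http://localhost:7333/"


def build_propfind(url: str, depth: str = "1") -> urllib.request.Request:
    """Build (but do not send) a WebDAV PROPFIND request for a collection."""
    # Minimal request body asking the server for all properties.
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PROPFIND",
        # Depth "1" asks for the collection itself plus its direct children.
        headers={"Depth": depth, "Content-Type": "application/xml"},
    )


request = build_propfind(WEBDAV_URL)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` against a running WebDAV server should return a `207 Multi-Status` XML response describing the entries, which is the same exchange a WebDAV-capable file browser performs.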
curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
chmod a+x cephadm
./cephadm bootstrap --mon-ip 192.168.1.41
Project mention: South Korea's No.1 Search Engine Chose JuiceFS over Alluxio for AI Storage | dev.to | 2024-01-18
Support for Kerberos keytab files
TileDB, Inc. | Full-Time | REMOTE | USA, Greece/EU | [https://tiledb.com](https://tiledb.com/)
TileDB has recently announced a $34 million Series B fund-raise and is actively hiring for engineers across a range of roles (SRE, backend/distributed systems, database internals, and more). You will have the opportunity to work on innovative technology that creates impact for challenging problems in genomics, geospatial, machine learning, distributed systems, and many other areas.
TileDB Cloud is the modern database, allowing developers and scientists to capture, analyze, and share any data with any tool. We build on a broad foundation of open source, maintaining the TileDB storage engine, libraries for genomics (single-cell and population), geospatial (raster, point clouds, and more), a TileDB visualization engine extending Babylon.js, and much more ([github.com/TileDB-Inc/TileDB](http://github.com/TileDB-Inc/TileDB))
With TileDB, all data — tables, genomics, images, videos, location, time-series — is captured as multi-dimensional arrays. To supercharge this data, TileDB Cloud implements a serverless infrastructure delivering query execution, access control, data and code sharing, and distributed computing at global scale — eliminating cluster management, minimizing TCO, and promoting scientific collaboration and reproducibility.
Website: [https://tiledb.com](https://tiledb.com/) | GitHub: https://github.com/TileDB-Inc/TileDB | Blog: https://tiledb.com/blog
We are actively hiring for several roles including:
- Site Reliability Engineer (k8s, Terraform, automation, Prometheus, CloudWatch, GitOps; Golang, Python)
HDFS related posts
- DuckDB + dbt for a serverless event correlation pipeline?
- SeaweedFS
- Experience running rook-ceph in production/large clusters
- First Homelab as a 19yr old Software Developer
- SeaweedFS is a fast distributed storage system for blobs, objects and files
- pandas 2.0 and the Arrow revolution (part I)
- SeaweedFS vs JuiceFS
A note from our sponsor - SaaSHub
www.saashub.com | 4 May 2024
Index
What are some of the best open-source HDFS projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | seaweedfs | 21,123 |
2 | Ceph | 13,259 |
3 | juicefs | 9,824 |
4 | smart_open | 3,093 |
5 | TileDB | 1,764 |
6 | hdfs | 1,347 |
7 | kafka-connect-ui | 496 |
8 | rumble | 207 |
9 | TileDB-Py | 179 |
10 | elbencho | 147 |
11 | apache-spark-docker | 40 |
12 | hdfs-rs | 33 |
13 | NiFItoKafkaConnect | 3 |