Metric | garage | tantivy
---|---|---
Mentions | 41 | 48
Stars | 369 | 9,955
Growth | 6.2% | 2.2%
Activity | 9.7 | 9.1
Last commit | 12 days ago | 4 days ago
Language | Rust | Rust
License | GNU Affero General Public License v3.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
garage
-
SeaweedFS fast distributed storage system for blobs, objects, files and datalake
Take a look at GarageS3, it's a nice option for "just an S3 server" for self hosting.
https://garagehq.deuxfleurs.fr/
I use it for self hosting.
-
A case for moving away from the cloud and embracing local storage solutions
Garage (https://garagehq.deuxfleurs.fr/) gets pretty close for object storage. It's built for mixing high- and low-latency links and for replication between multiple hosts. Unfortunately it's built for devs rather than end users, so there's no UI or anything like that.
-
Show HN: OpenSign – The open source alternative to DocuSign
> Theoretically they could swap with minio but last time we used it it was not a drop-in replacement yet.
Depends on whether AGPL v3 works for you or not (or whether you decide to pay them), I guess: https://min.io/pricing
I've actually been looking for more open alternatives, but haven't found much.
Zenko CloudServer seemed to be somewhat promising, but doesn't seem to be managed very actively: https://github.com/scality/cloudserver/issues/4986 (their Docker images on DockerHub were last updated 10 months ago, which is what the homepage links to; blog doesn't seem active since 2019, forums don't have much going on, despite some action on GitHub still)
There was also Garage, but that one is also AGPL v3: https://garagehq.deuxfleurs.fr/
The closest I got was discovering that SeaweedFS has an S3 compatible mode: https://github.com/seaweedfs/seaweedfs
-
Local-first software: You own your data, in spite of the cloud (2019)
Ah, you should check out Garage (https://garagehq.deuxfleurs.fr/) for a self-hosted, cluster-y API of S3
- Object storage - "we are finally building it"
-
Canva saves millions annually in Amazon S3 costs
I'm a big fan of Garage[1], which is a dead-simple S3 drop-in that you can host on your own drives. It's designed for consumer hardware with shitty internet in-between nodes.
[1]:https://garagehq.deuxfleurs.fr/
-
Quickwit 0.6.0 - Search and analytics on billions of logs with minimal hardware
One more thing we are also proud of: a bunch of our users are using the Garage object storage. This OSS project looks really promising, and we really cherish OSS for this kind of unexpected combination.
-
Show HN: Quickwit – Cost-efficient Elasticsearch alternative on object storage
- Another nice comment seen on HN « it seems to be very easy to run, not very IO intensive, and running fine on a single node with modest hardware with >2 billion log rows. It has a really cool dynamic schema feature too.» [9]
Fun fact: at least 4 users are using Garage[10] as their object storage. This OSS project looks really promising and made the HN front page a few months ago[11]; we really cherish OSS for this kind of unexpected combination.
Any feedback positive/negative always greatly appreciated here!
[0] Quickwit repo: https://github.com/quickwit-oss/quickwit
[1] Searching the web under 1000$/month: https://news.ycombinator.com/item?id=27074481
[2] Chitchat gossip library: https://github.com/quickwit-oss/chitchat
[3] Columnar format: https://github.com/quickwit-oss/tantivy/tree/main/columnar
[4] Tantivy library: https://github.com/quickwit-oss/tantivy/
[5] Whichlang library: https://github.com/quickwit-oss/whichlang
[6] GitHub Archive demo in terminal: https://www.youtube.com/watch?v=SNq3bARRlDI
[7] Indexing performance: https://twitter.com/fulmicoton/status/1638016949459488768
[8] https://twitter.com/arnonrgo/status/1645429632303235073?s=20
[9] https://news.ycombinator.com/item?id=35742544
[10] Garage object storage: https://garagehq.deuxfleurs.fr/
[11] https://news.ycombinator.com/item?id=33853539
-
The NixOS Foundation’s Call to Action: S3 Costs Require Community Support
On the technical side, garage (https://garagehq.deuxfleurs.fr/) does multi master replication by default, so is probably better for this use case. Still with S3 API.
-
Looking for a solution to merge storage across WAN
You are looking for garage.
tantivy
-
SeekStorm VS tantivy - a user suggested alternative
2 projects | 22 Mar 2024
-
What is Hybrid Search?
Tantivy - a full-text indexing library written in Rust. It has great performance and a great feature set.
- Tantivy – Fast, OSS full-text search library in Rust
-
RAG Using Unstructured Data and Role of Knowledge Graphs
By this I presume you mean build a search index that can retrieve results based on keywords? I know certain databases use Lucene to build a keyword-based index on top of unstructured blobs of data. Another alternative is to use Tantivy (https://github.com/quickwit-oss/tantivy), a Rust version of Lucene, if building search indices via Java isn't your cup of tea :)
Both libraries offer multilingual support for keywords, I believe, so that's a benefit over vector search, where multilingual embedding models are rather expensive.
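To make "a search index that can retrieve results based on keywords" concrete, here is a minimal sketch of an inverted index in plain Python. It is not how Lucene or Tantivy are implemented; the function names (`tokenize`, `build_index`, `search`) and the sample documents are hypothetical, and real libraries add ranking, language-aware tokenization, and on-disk storage.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return []
    # Intersect the posting sets of all query terms (AND semantics).
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(matches)


# Hypothetical sample corpus for illustration only.
docs = [
    "Tantivy is a full-text search library",
    "Garage is an S3 object store",
    "Full-text search in Rust",
]
index = build_index(docs)
hits = search(index, "full-text search")
```

The lookup touches only the posting sets of the query terms, which is why a keyword index can answer queries without scanning every blob.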
-
Show HN: Quickwit – OSS Alternative to Elasticsearch, Splunk, Datadog
We also implemented our schemaless columnar storage optimized for object storage.
The inverted index and columnar storage are part of tantivy [0], which is the fastest search library out there. We maintain it and we decided to build the distributed engine on top of it.
[0] tantivy github repo: https://github.com/quickwit-oss/tantivy
-
Pg_bm25: Elastic-Quality Full Text Search Inside Postgres
The issue for geo search is here: https://github.com/quickwit-oss/tantivy/issues/44
-
Grimoire - A recipe management application.
Search index: custom-built using tantivy.
-
A Compressed Indexable Bitset
The roaring bitmap variant is used only for the optional index (1 docid => 0 or 1 value) in the columnar storage (DocValues), not for the inverted index. Since this is used for aggregation, some queries may be a full scan.
The inverted index in tantivy uses bitpacked values of 128 elements with a skip index on top.
> I didn't follow the rest of your comment, select is what EF is good at, every other data structure needs a lot more scanning once you land on the right chunk. With BMI2 you can also use the PDEP instruction to accelerate the final select on a 64-bit block
The select for the sparse codec is a [simple array index access](https://github.com/quickwit-oss/tantivy/blob/main/columnar/s...), that is hard to beat. Compression is not good near the 5k threshold though.
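A rough sketch of the two ideas in this comment, in plain Python: posting lists split into blocks of 128 values that are bit-packed at a fixed width, with a skip index on top, and a `select` on a sparse block that is a single array access. This is an illustration only and does not reproduce tantivy's actual layout; delta-encoding the doc ids before packing is an assumption made here, and all function names are hypothetical.

```python
from bisect import bisect_left

BLOCK_SIZE = 128  # block length mentioned in the comment above


def pack_block(values):
    """Bit-pack non-negative ints using the smallest fixed width that fits the block."""
    num_bits = max((v.bit_length() for v in values), default=0)
    packed = 0
    for i, v in enumerate(values):
        packed |= v << (i * num_bits)
    return num_bits, packed


def unpack_block(num_bits, packed, count):
    """Recover `count` values of `num_bits` bits each from a packed integer."""
    mask = (1 << num_bits) - 1
    return [(packed >> (i * num_bits)) & mask for i in range(count)]


def encode_postings(doc_ids):
    """Delta-encode sorted doc ids (an assumption), pack them per block, build a skip index."""
    blocks, skip_index = [], []
    prev = 0
    for start in range(0, len(doc_ids), BLOCK_SIZE):
        chunk = doc_ids[start:start + BLOCK_SIZE]
        base, deltas = prev, []
        for doc_id in chunk:
            deltas.append(doc_id - prev)
            prev = doc_id
        num_bits, packed = pack_block(deltas)
        blocks.append((base, len(chunk), num_bits, packed))
        skip_index.append(chunk[-1])  # last doc id of the block
    return blocks, skip_index


def decode_block(block):
    """Unpack one block and undo the delta encoding."""
    base, count, num_bits, packed = block
    doc_ids, current = [], base
    for delta in unpack_block(num_bits, packed, count):
        current += delta
        doc_ids.append(current)
    return doc_ids


def contains(blocks, skip_index, target):
    """Use the skip index to pick the only block that can hold `target`, then decode it."""
    i = bisect_left(skip_index, target)
    if i == len(blocks):
        return False
    return target in decode_block(blocks[i])


def sparse_select(sorted_ids, rank):
    """select(rank) on a sparse block stored as a sorted array is one index access."""
    return sorted_ids[rank]
```

With this layout a lookup decodes at most one block of 128 values, while the skip index keeps the search for that block logarithmic in the number of blocks.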
-
Job: Rust + Retrieval Systems at Etsy
Hi /r/rust, I’m a SWE on Etsy’s Retrieval Systems team where we’re building a platform based on rust and tantivy (https://github.com/quickwit-oss/tantivy). We’re looking to bring two new engineers onto the team.
-
Announcing Velo - Your Rust-Powered Brainstorming and Note-Taking Tool
Quick Search: Easily find specific notes with Velo's fuzzy-search feature, powered by tantivy. tantivy might have been a little overkill, but it was really easy to integrate.
What are some alternatives?
seaweedfs - SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.
sonic - 🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.
ceph-containers - OCI compliant Ceph Container Images based on Ubuntu LTS
surrealdb - A scalable, distributed, collaborative, document-graph database, for the realtime web
Zenko - Zenko is the open source multi-cloud data controller: own and keep control of your data on any cloud.
milli - Search engine library for Meilisearch ⚡️
s3ql - a full featured file system for online data storage
MeiliSearch - A lightning-fast search API that fits effortlessly into your apps, websites, and workflow
Nebula - A scalable overlay networking tool with a focus on performance, simplicity and security
quickwit - Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
Seaweed File System - SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding. [Moved to: https://github.com/seaweedfs/seaweedfs]
fselect - Find files with SQL-like queries