-
rocksdb-cloud
A library that provides an embeddable, persistent key-value store for fast storage optimized for AWS
-
tantivy
Discontinued. Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust. [Moved to: https://github.com/quickwit-oss/tantivy] (by quickwit-inc)
Combining data-at-rest with a slim index structure and a common access method (like HTTP) was the idea behind a tool I once wrote, a key-value store for JSON: https://github.com/miku/microblob
I first thought of building a custom index structure, but found that I did not need everything in memory all the time. Using an embedded leveldb works just fine.
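To make the "data-at-rest plus slim index" idea concrete, here is a minimal sketch in Python. It is not microblob's actual code: it assumes the data is a newline-delimited JSON file, uses a plain in-memory dict where the comment above uses an embedded LevelDB, and the names `build_index` and `lookup` are made up for illustration.

```python
import json


def build_index(path, key_field):
    """Map each document's key to the (offset, length) of its line in the file."""
    index = {}
    offset = 0
    with open(path, "rb") as f:
        for line in f:  # binary mode: len(line) is a byte count, newline included
            doc = json.loads(line)
            index[doc[key_field]] = (offset, len(line))
            offset += len(line)
    return index


def lookup(path, index, key):
    """Fetch one document by seeking straight to its bytes; the file stays untouched."""
    offset, length = index[key]
    with open(path, "rb") as f:
        f.seek(offset)
        return json.loads(f.read(length))
```

The index only holds keys and byte positions, so it stays small compared to the data. Swapping the dict for an on-disk key-value store is what removes the need to keep everything in memory.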
> If you think more about this, it would be like a distributed key-value store that supports both disk and memory access. You could write one using some open-source Raft libraries, or a possible candidate is TiKV from PingCAP
My whole point was not building it ;)
There's also https://github.com/NVIDIA/aistore
How big is each document? If documents are big, keep each of them as a separate file and store the ids in a database. If documents are small, then you want something like https://github.com/rockset/rocksdb-cloud as a building block
What we store on S3 is a regular tantivy index and another tiny data structure that we call "turbo index", which makes queries faster on object storage. For this demo, the tantivy indexes are fairly large and we issue HTTP Range requests against them.
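For readers unfamiliar with the mechanism, an HTTP Range request reads a slice of a remote object without downloading all of it. The sketch below is a generic Python illustration, not the search engine's own (Rust) code. The helper names `range_header` and `fetch_range` are hypothetical, and the URL passed in is assumed to point at an object on a server that supports byte ranges, as S3 does.

```python
import urllib.request


def range_header(offset, length):
    """Build a Range header value; HTTP byte ranges are inclusive on both ends."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return f"bytes={offset}-{offset + length - 1}"


def fetch_range(url, offset, length):
    """Read `length` bytes starting at `offset` from a remote object."""
    req = urllib.request.Request(url, headers={"Range": range_header(offset, length)})
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content means the server honoured the range;
        # a plain 200 would mean it sent the whole object instead.
        if resp.status != 206:
            raise RuntimeError(f"range not honoured, got HTTP {resp.status}")
        return resp.read()
```

Each such request pays a full network round trip, which is why a small side structure that says which byte ranges a query needs can make a noticeable difference on object storage.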
https://github.com/tantivy-search/tantivy