I copied this approach from several papers, with some improvements, for my datastore.
Looks like a job for GoLevelDB.
- bbolt for storage on disk. In order to get the smallest db file size possible make sure you insert the keys in order and set:
- a cuckoo filter for fast lookup. This has around a 3% false positive rate, though other implementations have a much lower rate. You can also store the filter in the database, in a different bucket, so you don't have to rebuild it on startup.
Use a two-stage approach: first a bloom/cuckoo filter held in memory, stored as a Roaring bitmap (https://roaringbitmap.org/), then a secondary key/value store on disk (bolt or anything else).
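A minimal sketch of that two-stage lookup, using only the standard library. It makes several assumptions: the filter is a plain bitset Bloom filter rather than a cuckoo filter or a Roaring bitmap, the on-disk stage is stood in for by a map behind a hypothetical `Store` interface (bolt or similar would go there), and every name (`bloom`, `TwoStage`, `mapStore`) is made up for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a small Bloom filter: k bit positions per key in an m-bit array.
type bloom struct {
	bits []uint64
	m, k uint64
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes returns two 64-bit values used for double hashing.
func (b *bloom) hashes(key []byte) (uint64, uint64) {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the input to get a second value
	return h1, h.Sum64() | 1
}

func (b *bloom) Add(key []byte) {
	h1, h2 := b.hashes(key)
	for i := uint64(0); i < b.k; i++ {
		p := (h1 + i*h2) % b.m
		b.bits[p/64] |= 1 << (p % 64)
	}
}

// MayContain never returns false for a key that was added,
// but may return true for a key that was not (false positive).
func (b *bloom) MayContain(key []byte) bool {
	h1, h2 := b.hashes(key)
	for i := uint64(0); i < b.k; i++ {
		p := (h1 + i*h2) % b.m
		if b.bits[p/64]&(1<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

// Store is a hypothetical interface for the second, on-disk stage.
type Store interface {
	Get(key []byte) ([]byte, bool)
	Put(key, value []byte)
}

// mapStore is an in-memory placeholder for a real disk-backed store.
type mapStore struct{ data map[string][]byte }

func (s *mapStore) Get(key []byte) ([]byte, bool) { v, ok := s.data[string(key)]; return v, ok }
func (s *mapStore) Put(key, value []byte)         { s.data[string(key)] = value }

// TwoStage consults the in-memory filter before touching the store.
type TwoStage struct {
	filter *bloom
	disk   Store
}

func (t *TwoStage) Put(key, value []byte) {
	t.filter.Add(key)
	t.disk.Put(key, value)
}

func (t *TwoStage) Get(key []byte) ([]byte, bool) {
	if !t.filter.MayContain(key) {
		return nil, false // definite miss, no disk access needed
	}
	return t.disk.Get(key) // possible hit, the store has the final say
}

func main() {
	ts := &TwoStage{filter: newBloom(1<<16, 4), disk: &mapStore{data: map[string][]byte{}}}
	ts.Put([]byte("alpha"), []byte("1"))
	v, ok := ts.Get([]byte("alpha"))
	fmt.Println(string(v), ok)
	_, ok = ts.Get([]byte("missing"))
	fmt.Println(ok)
}
```

The filter only ever saves work on misses: a false positive costs one extra store read, and the store still gives the correct answer.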
https://github.com/peterbourgon/diskv might be a solution
Most hash map (or set) implementations also overallocate quite a bit to reduce the number of collisions. You could use a custom map implementation with a tuned load factor, so that you can trade speed for memory. Have a look at the Go map implementation to see how that could work: https://github.com/golang/go/blob/master/src/runtime/map.go That said, unless you have a very good reason to go down that rabbit hole, I'd avoid it.
https://github.com/cockroachdb/pebble is a pure Go, SSD-native key-value store. You could think of it as a map[[]byte][]byte on persistent storage.
github.com/colinmarc/cdb is a Go implementation,