| | napkin-math | ristretto |
|---|---|---|
| Mentions | 13 | 19 |
| Stars | 3,093 | 5,354 |
| Growth | - | 1.6% |
| Activity | 6.3 | 6.1 |
| Latest commit | 12 days ago | about 2 months ago |
| Language | Rust | Go |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
napkin-math
- capacity planning in system design interviews
- Napkin Math
- S3 Express Is All You Need
Most production storage systems/databases built on top of S3 spend a significant amount of effort building an SSD/memory caching tier to make them performant enough for production (e.g. on top of RocksDB). But it's not easy to keep it in sync with blob...
Even with the cache, the cold query latency lower-bound to S3 is subject to ~50ms roundtrips [0]. To build a performant system, you have to tightly control roundtrips. S3 Express changes that equation dramatically, as S3 Express approaches HDD random read speeds (single-digit ms), so we can build production systems that don't need an SSD cache—just the zero-copy, deserialized in-memory cache.
Many systems will probably continue to have an SSD cache (~100 µs random reads), but now MVPs can be built without it, and cold query latency goes down dramatically. That's a big deal.
We're currently building a vector database on top of object storage, so this is extremely timely for us... I hope GCS ships this ASAP. [1]
[0]: https://github.com/sirupsen/napkin-math
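As a rough illustration of the roundtrip argument in this comment, the sketch below multiplies a number of dependent roundtrips by the per-read latencies quoted above. The three-roundtrip query and the 5 ms midpoint for "single-digit ms" are assumptions chosen for illustration, not measurements.

```python
# Per-read latencies in microseconds, taken from the figures quoted above.
LATENCY_US = {
    "S3 (standard)": 50_000,  # ~50 ms roundtrip
    "S3 Express": 5_000,      # "single-digit ms"; 5 ms is an assumed midpoint
    "SSD cache": 100,         # ~100 us random read
}


def cold_query_latency_us(dependent_roundtrips: int, per_read_us: int) -> int:
    """Lower bound on latency when each read must wait for the previous one."""
    return dependent_roundtrips * per_read_us


if __name__ == "__main__":
    roundtrips = 3  # hypothetical query needing three dependent reads
    for tier, per_read_us in LATENCY_US.items():
        total_ms = cold_query_latency_us(roundtrips, per_read_us) / 1000
        print(f"{tier}: {total_ms:g} ms")
```

The point of the comparison is the order of magnitude: with dependent reads, the per-roundtrip latency dominates, so cutting it by roughly 10x cuts the cold-query lower bound by the same factor.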
- Random Read or Sequential Read
Trying to estimate performance using some napkin math based on this: https://github.com/sirupsen/napkin-math
- A CVE has been issued for hyper. Denial of Service possible
So, napkin maths time. Assume a typical, bog-standard cross-world network speed of ~25 MiB/s for a single TCP channel. A single HEADERS+RST pair is likely < 128 bytes (40 for the HEADERS + whatever payload, and 32 for the RST). So 8 pairs per KiB, 8K pairs per MiB, 200K pairs per 25 MiB...
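The arithmetic in this estimate can be reproduced with a few lines of Python. This is only a sketch of the commenter's reasoning: it treats the 128-byte figure as an upper bound on the size of one HEADERS+RST pair, so the results are lower bounds on the number of pairs. The function name is made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def pairs_per_volume(volume_bytes: int, pair_bytes: int = 128) -> int:
    """How many HEADERS+RST pairs of `pair_bytes` fit into `volume_bytes`."""
    return volume_bytes // pair_bytes


if __name__ == "__main__":
    print(pairs_per_volume(KIB))       # pairs per KiB
    print(pairs_per_volume(MIB))       # pairs per MiB
    print(pairs_per_volume(25 * MIB))  # pairs per second at ~25 MiB/s
```

Under these assumptions the script gives 8 pairs per KiB, 8,192 per MiB, and 204,800 per second at 25 MiB/s, which matches the rounded "200K" in the comment.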
- Index Merges vs Composite Indexes in Postgres and MySQL
- I/O is no longer the bottleneck
Yes, sequential I/O bandwidth is closing the gap to memory. [1] The I/O pattern to watch out for, and the biggest reason why e.g. databases do careful caching to memory, is that _random_ I/O is still dreadfully slow. I/O bandwidth is brilliant, but latency is still disappointing compared to memory.
[1]: https://github.com/sirupsen/napkin-math
- Monthly cost to host server for 1M DAUs?
- Napkin-math: Techniques and numbers for estimating system's performance
- System Design prep?
https://github.com/sirupsen/napkin-math (memorize these)
ristretto
- Otter, Fastest Go in-memory cache based on S3-FIFO algorithm
1. Unfortunately, ristretto has been showing hit ratio around 0 on almost all traces for a very long time now and the authors don't respond to this in any way. Vitess for example has already changed it to another cache. Here are two issues about it: https://github.com/dgraph-io/ristretto/issues/346 and https://github.com/dgraph-io/ristretto/issues/336. That is, ristretto shows such results even on its own benchmarks. You can see it just by running hit ratio benchmarks on a very simple zipf distribution from the ristretto repository: https://github.com/dgraph-io/ristretto/blob/main/stress_test.... On this test I got the following:
- S3 Express Is All You Need
That's exactly how Userify[0] used to work. (when it was Python; now that it's a Go app, we do the caching in memory using Ristretto[1]).
0. https://userify.com (team ssh key management/sudo authz)
1. https://github.com/dgraph-io/ristretto
- Theine - High performance in-memory cache
I also ran some hit ratio benchmarks, and Theine's results are much better than Ristretto's. See results in README: https://github.com/Yiling-J/theine-go#hit-ratios
- Python deserves a good in-memory cache library!
If you know Caffeine(Java)/Ristretto(Go)/Moka(Rust), you know what Theine is. Python deserves a good in-memory cache library.
- VCache: A Simple In-Memory Cache Library
Thanks for sharing. There are a lot of options for embedded in-memory caches: https://github.com/dgraph-io/ristretto https://awesome-go.com/caches/ Do you have any comparisons or details on how your project has a different approach?
- Cacheme: Asyncio cache framework with multiple storages and thundering herd protection
I made Cacheme years ago; it supported only Redis and a synchronous API. Then I switched to Go and found that there are some awesome cache projects in Go (ristretto, gocache...), and I also made my own Go version of Cacheme: cacheme-go. After trying asyncio and type hints, I think it's time to rewrite my old Cacheme.
- Show HN: Zcached, in-memory key-value cache wire-compatible with memcached
zcached is an in-memory key-value cache exposing a memcached ASCII protocol-compatible interface, built on pluggable cache engines like Ristretto and freecache [0].
It's not performance-competitive with memcached, especially at higher thread counts. That said, it achieves about 1.1M ops/s, but at significantly higher P99 and P999 latency (as measured by memtier). See [1] and [2] for benchmark results from my 7950x-based workstation.
Disclaimer: This is a hobby project created for fun while hacking over the holidays. zcached is not a commercial product and never will be. Don't use it in production; consider this a technology demo more than anything.
I don't expect the source code to build outside of my environment, but for those interested in playing with it, binary artifacts are available at [3]. Try `zcached --address tcp:localhost:11211`.
[0] https://github.com/dgraph-io/ristretto, https://github.com/coocood/freecache
- What is the coolest Go open source projects you have seen?
- Quitting Dgraph Labs
While I never used dgraph, I do use badger and ristretto and am similarly in a bind over their long-term survival (more so badger than ristretto)...
- Recommendation for Key/Value storage
There are also different packages that wrap the Go map, depending on your requirements: (storing a lot of data) https://github.com/allegro/bigcache or (need performance) https://github.com/dgraph-io/ristretto. For basic use cases, the standard Go map should be enough. Just keep in mind whether you need concurrent access to your data structure, in which case you should guard your map with a mutex.
What are some alternatives?
huniq - Filter out duplicates on the command line. Replacement for `sort | uniq` optimized for speed (10x faster) when sorting is not needed.
go-cache-benchmark - Cache benchmark for Golang
advisory-database - Security vulnerability database inclusive of CVEs and GitHub originated security advisories from the world of open source software.
BigCache - Efficient cache for gigabytes of data written in Go.
adix - An Adaptive Index Library for Nim
stretto - Stretto is a Rust implementation for Dgraph's ristretto (https://github.com/dgraph-io/ristretto). A high performance memory-bound Rust cache.
h2 - HTTP 2.0 client & server implementation for Rust.
moka - A high performance concurrent caching library for Rust
RAMCloud - **No Longer Maintained** Official RAMCloud repo
parquet-go - Go library to read/write Parquet files
simdjson - Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
IceFireDB - @IceFireLabs -> IceFireDB is a database built for web3.0. It strives to fill the gap between web2 and web3.0 with a friendly database experience, making web3 application data storage more convenient, and making it easier for web2 applications to achieve decentralization and data immutability.