I wonder if dormando, who sometimes comes around, would care to run memcached with the same traces as are used in this paper, which are available at https://github.com/twitter/cache-trace. I'm not sure I care about a cache that can scale to 24 cores, as in my experience I usually end up with hundreds of caches, each with a few cores, rather than fewer, bigger cache servers, but it would still be interesting to see what memcached can do.
According to https://github.com/Cyan4973/xxHash, the best hash function can only do hundreds of millions of hashes per second, so how can a local cache run at such throughput? I assume that when measuring cache throughput, one needs to calculate the hash, look up the entry, (maybe compare keys), and copy the data.
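To make the per-read work in that question concrete, here is a minimal sketch of a cache read broken into the four steps listed above. Everything in it is an assumption for illustration: `TinyCache` is a hypothetical class, Python's built-in `hash` stands in for a real hash function such as xxHash, and nothing here reflects how the cache in the paper or memcached is implemented.

```python
class TinyCache:
    """Hypothetical chained hash table, used only to label the steps of a read."""

    def __init__(self, buckets=1024):
        self._buckets = [[] for _ in range(buckets)]

    def put(self, key: bytes, value: bytes) -> None:
        bucket = self._buckets[hash(key) % len(self._buckets)]
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:
                bucket[i] = (key, bytearray(value))  # overwrite existing entry
                return
        bucket.append((key, bytearray(value)))

    def get(self, key: bytes):
        h = hash(key)                                   # 1. calculate the hash
        bucket = self._buckets[h % len(self._buckets)]  # 2. look up the bucket
        for stored_key, stored_value in bucket:
            if stored_key == key:                       # 3. compare keys
                return bytes(stored_value)              # 4. copy the data out
        return None


cache = TinyCache()
cache.put(b"user:1", b"alice")
value = cache.get(b"user:1")
missing = cache.get(b"user:2")
```

Each numbered step costs time on every read, so a throughput figure only makes sense once it is clear which of these steps the benchmark includes.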
A multi-threaded benchmark of a cache should use a fully populated cache and a scrambled Zipfian distribution. This emulates hot/cold entries and highlights the areas of contention (locks, CASes, etc.). Lock-free reads benefit from CPU cache efficiency, which can cause superlinear throughput growth.
This shows whether the implementation could be a bottleneck or scales well enough, after which the hit rate and other factors are more important than raw throughput. I would rather sacrifice a few nanoseconds on a read than suffer much lower hit rates or have long pauses on a write due to eviction inefficiencies.
[1] https://github.com/ben-manes/caffeine/wiki/Benchmarks#read-1...
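As a rough illustration of the workload described above, the sketch below generates a scrambled Zipfian key trace and replays it against a fully populated cache. It is single-threaded and measures nothing; it only shows the shape of the access pattern. The function names, the exponent `s=0.99`, the key count, and the use of a plain dict as the cache are all assumptions made for this example, not details taken from the linked benchmark.

```python
import bisect
import itertools
import random
from collections import Counter


def zipf_cdf(n, s=0.99):
    """Cumulative distribution over ranks 1..n with P(rank) proportional to 1 / rank**s."""
    weights = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    total = sum(weights)
    return list(itertools.accumulate(w / total for w in weights))


def scrambled_zipf_keys(n, count, s=0.99, seed=1):
    """Return `count` keys in [0, n) drawn from a Zipfian distribution.

    "Scrambled" here means popularity rank is mapped to a key through a
    fixed random permutation, so hot keys are spread over the key space
    instead of being clustered at the smallest key values.
    """
    rng = random.Random(seed)
    cdf = zipf_cdf(n, s)
    rank_to_key = list(range(n))
    rng.shuffle(rank_to_key)
    trace = []
    for _ in range(count):
        # Inverse-CDF sampling; min() guards against float rounding at the tail.
        rank = min(bisect.bisect_left(cdf, rng.random()), n - 1)
        trace.append(rank_to_key[rank])
    return trace


N = 1000
trace = scrambled_zipf_keys(N, 100_000)

# Fully populate the cache first, so every read in the trace is a hit and
# the run exercises only the read path.
cache = {key: f"value-{key}" for key in range(N)}
hits = sum(1 for key in trace if key in cache)

# A small set of hot keys should account for a large share of accesses.
top10_share = sum(c for _, c in Counter(trace).most_common(10)) / len(trace)
```

In a real benchmark, each thread would replay a trace like this concurrently against the shared cache, which is where contention on the hot keys becomes visible.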