There are probably a bunch of reasons, which is why I want an easy "run benchmarks" command that I can use. I'd even be fine using infra so long as I had pulumi/terraform to set it all up for me.
I just don't want to spin up EC2 instances manually, get the connections all working, make sure I can reset state, etc.
I already have a fork of Scylla where I removed a lot of unnecessary cloning of `String` but no way I'm gonna PR it without a benchmark.
I also opened a PR to replace the hash algorithm used in their PreparedStatement cache, which gets hit for every query, but they wanted benchmarks before accepting (completely fair) and I have none. `ahash` is extremely fast compared to Rust's default hasher (SipHash) - https://github.com/tkaitchuck/ahash - and with the `compile-time-rng` feature (more than sufficient for the scylla use case) you can avoid a system call when creating the HashMap.
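For anyone who hasn't swapped a hasher before, here's a rough sketch of the mechanism using only the standard library. To be clear, this is not the actual PR and not `ahash`: the FNV-1a hasher is a stand-in I picked so the example has no dependencies, and `StatementCache` / `new_cache` are made-up names, not the driver's real types. The point is just that the hasher is a type parameter on `HashMap`, and a fixed-seed `BuildHasher` needs no randomness at construction time.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// 64-bit FNV-1a. Stand-in for a faster non-cryptographic hasher such as
/// ahash (assumption: chosen only to keep this example dependency-free).
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // FNV-1a 64-bit offset basis: a fixed constant, so no OS randomness.
        Fnv1a(0xcbf29ce484222325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            // FNV 64-bit prime.
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Builds a fresh `Fnv1a` for every hashing operation.
type FixedSeedBuild = BuildHasherDefault<Fnv1a>;

/// Hypothetical cache keyed by query text; the `u64` value stands in for
/// whatever a prepared-statement handle would be.
type StatementCache = HashMap<String, u64, FixedSeedBuild>;

fn new_cache() -> StatementCache {
    HashMap::with_hasher(FixedSeedBuild::default())
}

fn main() {
    let mut cache = new_cache();
    cache.insert("SELECT * FROM ks.t WHERE id = ?".to_string(), 1);
    // Lookups by &str work because String: Borrow<str>.
    assert_eq!(cache.get("SELECT * FROM ks.t WHERE id = ?"), Some(&1));
    assert_eq!(cache.get("SELECT 1"), None);
}
```

A fixed seed gives up the HashDoS resistance that the default randomly keyed SipHash provides, so whether that's acceptable depends on who controls the keys. None of this says anything about speed either; that still needs the benchmark.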
This is the best, most comprehensive hash test suite I know of: https://github.com/rurban/smhasher/
You might want to look at murmur, spooky, and metrohash in particular. I'm not sure what the tradeoffs are or exactly what you need, but that site should be a good starting point at least.
You might want to check out ScyllaDB Stress Orchestrator. Not sure of the current state of the code, but it's meant to do what you are talking about:
https://github.com/scylladb/scylla-stress-orchestrator/wiki/...
Do you mean this? https://github.com/jonhoo/left-right
I am not sure of the performance or implementation difficulty but the data structure seems to be what you are talking about.