Our great sponsors
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
If the OP has time, it might be nice to experiment with the hash table technique used by sets in Python. It has a variant of open addressing the runs several linear probes before jumping to another location in hash table.
The premise is that probing consecutive memory addresses is cheap because of cache locality and because the address computation is a simple increment. This should be cheaper than resolving hash collisions with additional random probes or by chasing linked list references to random memory locations for each link.
https://github.com/python/cpython/blob/f840398a5fd8741653c26...