-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
What is not entirely obvious here is that under the hood you can install a nice library called numexpr (docs, src) that exists to make calculations with large NumPy (and pandas) objects potentially much faster. When you use query or eval, this expression is passed into numexpr and optimized using its bag of tricks. Expected performance improvement can be between .95x and up to 20x, with average performance around 3-4x for typical use cases. You can read details in the docs, but essentially numexpr takes vectorized operations and makes them work in chunks that optimize for cache and CPU branch prediction. If your arrays are really large, your cache will not be hit as often. If you break your large arrays into very small pieces, your CPU won’t be as efficient.