-
substrait
A cross-platform way to express data transformations, relational algebra, standardized record expressions, and plans.
-
velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
4. The workers (if distributed) or the engine (if single-box) executing the execution plan, by performing the actual computations on actual bytes.
(Note that this is a cartoon version and any given engine will differ in the details, e.g., Presto/Trino does not have a clear distinction between the Logical Plan and the Physical Plan.)
1) and 4) are broadly similar across engines, while 2) and 3) vary widely, partly because engines have different requirements (reliability, latency, etc). Projects such as Apache Arrow and Velox (https://github.com/facebookincubator/velox) are building common tools for 4), and, as mentioned, ANSI SQL, ZetaSQL, Calcite, and Substrait are building common tools for 1).
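To make step 4) a bit more concrete, here is a toy sketch of what "executing the execution plan on actual bytes" can look like. It is not the Arrow or Velox API; the `Scan`, `Filter`, and `SumAgg` operators and the `run` helper are hypothetical names made up for illustration, and a batch is assumed to be a plain dict of column name to list of ints. The plan below corresponds roughly to `SELECT SUM(amount) WHERE amount > 15`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List

# Assumption: a "batch" is a small columnar chunk, column name -> values.
Batch = Dict[str, List[int]]


@dataclass
class Scan:
    """Leaf operator: hands out batches that are already in memory."""
    batches: List[Batch]

    def execute(self) -> Iterator[Batch]:
        yield from self.batches


@dataclass
class Filter:
    """Keeps only the rows where predicate(column value) is true."""
    child: object
    column: str
    predicate: Callable[[int], bool]

    def execute(self) -> Iterator[Batch]:
        for batch in self.child.execute():
            # Build one selection mask per batch, then apply it to every column.
            keep = [self.predicate(v) for v in batch[self.column]]
            yield {
                name: [v for v, k in zip(values, keep) if k]
                for name, values in batch.items()
            }


@dataclass
class SumAgg:
    """Sums one column across all incoming batches."""
    child: object
    column: str

    def execute(self) -> Iterator[Batch]:
        total = 0
        for batch in self.child.execute():
            total += sum(batch[self.column])
        yield {f"sum_{self.column}": [total]}


def run(plan) -> List[Batch]:
    """Pull every output batch from the root operator."""
    return list(plan.execute())


# Hypothetical data: two batches of (id, amount) rows.
data = [
    {"id": [1, 2, 3], "amount": [10, 20, 30]},
    {"id": [4, 5], "amount": [40, 50]},
]
plan = SumAgg(Filter(Scan(data), "amount", lambda v: v > 15), "amount")
result = run(plan)
```

The point of the sketch is only that operators at this level work on whole batches of column data rather than on SQL text or plan nodes, which is why libraries for this layer can be shared across engines that differ a lot in 2) and 3).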