Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Check out Trino (https://trino.io, formerly PrestoSQL, a fork of Presto), which is the kind of engine Athena is based on. Essentially, parallelism is what makes this possible: there are many worker nodes all reading from S3 at the same time. You can also run Presto on EMR, which is sort of fun because the admin UI shows you how it breaks the query into parts and fans the work out to the worker nodes. Pretty cool, because if it is allowed to (from a resource management perspective), Presto will try to saturate the CPU resources of the entire cluster to complete the query as fast as possible.
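To make the fan-out idea concrete, here is a toy sketch in plain Python. It is not how Trino or Presto is actually implemented, and it does not touch S3. The names `make_splits`, `partial_sum`, and `run_query` are made up for illustration, and a thread pool stands in for the worker nodes. It only shows the general pattern: cut the input into chunks, let workers compute partial results in parallel, then merge the partials in one place.

```python
from concurrent.futures import ThreadPoolExecutor


def make_splits(rows, n_splits):
    """Divide the input into roughly equal chunks, one per unit of work."""
    size = max(1, -(-len(rows) // n_splits))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partial_sum(split):
    """Work done by one 'worker': a partial aggregate over its own chunk."""
    return sum(split), len(split)


def run_query(rows, n_workers=4):
    """Stand-in for a coordinator: fan the chunks out, then merge the partials."""
    splits = make_splits(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, splits))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total, count
```

Calling `total, count = run_query(list(range(1, 101)))` and then computing `total / count` gives the average, roughly what a `SELECT avg(x)` would do once each worker has reported its partial sum and row count.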