How does AWS Athena manage to load 10GB/s from s3? I've managed 230 mb/s from a c6gn.16xlarge


Trino

Mentions: 44 | Stars: 9,552 | Activity: 10.0 | Language: Java

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Check out https://trino.io (formerly PrestoSQL), which is what Athena is based on. Essentially, parallelism is what makes this possible: there are many worker nodes all reading from S3 at the same time. You can also run Presto on EMR, which is sort of fun because the admin UI shows you how it breaks the query into parts and fans the work out to the worker nodes. Pretty cool, because if it's allowed to (from a resource management perspective), Presto will try to saturate the CPU of the entire cluster to complete the query as fast as possible.
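To make the fan-out idea concrete, here is a minimal sketch of the same pattern inside a single process: split an object into byte ranges and fetch the ranges concurrently. This is not how Athena or Trino is implemented; `split_ranges`, `parallel_read`, and `fake_fetch` are hypothetical names made up for illustration, and `fake_fetch` reads from an in-memory buffer so the example runs without AWS credentials. Against real S3, the fetch function would presumably issue a ranged GET (for example boto3's `get_object` with a `Range="bytes=start-end"` argument).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(size: int, chunk: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges that cover [0, size)."""
    return [(start, min(start + chunk, size) - 1) for start in range(0, size, chunk)]


def parallel_read(
    fetch: Callable[[int, int], bytes], size: int, chunk: int, workers: int
) -> bytes:
    """Fetch every range concurrently and reassemble the parts in order."""
    ranges = split_ranges(size, chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input ranges.
        parts = pool.map(lambda r: fetch(*r), ranges)
        return b"".join(parts)


# Stand-in for an S3 object (assumption: real code would do a ranged GET here).
DATA = bytes(range(256)) * 40  # 10,240 bytes


def fake_fetch(start: int, end: int) -> bytes:
    """Hypothetical fetcher: return the inclusive byte range from DATA."""
    return DATA[start : end + 1]


result = parallel_read(fake_fetch, len(DATA), chunk=1000, workers=8)
```

A query engine applies the same idea across many machines rather than threads, so aggregate throughput scales with the number of workers instead of being capped by one instance's network path.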

Related posts

Trino: Fast distributed SQL query engine for big data analytics
1 project | news.ycombinator.com | 19 Mar 2024
Game analytic power: how we process more than 1 billion events per day
1 project | dev.to | 24 Nov 2023
Your Thoughts on OLAPs Clickhouse vs Apache Druid vs Starrocks in 2023/2024
2 projects | /r/dataengineering | 16 Nov 2023
Trino, a open query engine that runs at ludicrous speed
1 project | news.ycombinator.com | 11 Jul 2023
Questions about Athena, Trino and Iceberg
2 projects | /r/dataengineering | 15 Jun 2023

This page summarizes the projects mentioned and recommended in the original post on /r/aws
Projects Database Java Presto Hive
Post date: 16 Mar 2021
