Session

A Case Study in API Cost of Running Analytics in the Cloud at Scale with an Open-Source Data Stack

Migrating data-intensive analytics applications to cloud-native environments promises greater scalability and flexibility, but it also introduces complex cost models that challenge traditional optimization strategies. On-premises setups were optimized mainly for speed; cloud deployments require a more nuanced approach that also accounts for the cost of cloud storage operations, which can escalate rapidly in real-world scenarios.

In this presentation, Bin will analyze these challenges through a case study of Uber's large-scale analytics SQL platform deployed on HDFS and GCS. He will present findings on the unexpected cost implications of standard I/O optimizations, such as table scans, filters, and broadcast joins, when they are implemented in cloud environments. He will also highlight the need for a paradigm shift in optimizing data-intensive applications for the cloud and advocate for new I/O strategies that balance performance and cost and are tailored to the unique demands of cloud ecosystems.

Bin Fan

Founding Engineer, Alluxio
