From Zero to One: Building a Petabyte-Scale Data Analytics Platform with Apache Iceberg™

Apache Iceberg™ is transforming modern data architecture by providing the efficiency and flexibility of a managed data warehouse without vendor lock-in. At TRM Labs, our data platform traditionally relied on BigQuery and distributed Postgres to serve low-latency, high-concurrency queries over terabyte-sized datasets for external, customer-facing consumption, but this setup proved both expensive and limiting. We made a bold move: adopting Iceberg at the core of a petabyte-scale data lakehouse to power external, user-facing analytics.

In this session, we will discuss why your organization should consider adopting Iceberg. We will cover how to benchmark it against other table formats for a high-performance, low-latency analytics platform, and walk through key architectural decisions across data ingestion, data modeling, compute optimization, and data operations that enable efficient scaling. We will also share performance-tuning techniques, including clustering and advanced data and metadata caching, that helped us improve query efficiency and reduce compute and storage costs. If you are looking for practical guidance on building a roadmap for adopting a lakehouse, we will share suggestions and lessons learned.

Whether you’re considering Iceberg or scaling an existing implementation, this session will equip you with actionable insights to build a long-term, high-performance analytics strategy.

Vijay Shekhawat

Staff Software Engineer - Data at TRM Labs

Bristol, United Kingdom

