CDC Pipelines on Iceberg: When Theory Meets Production Workloads
Change Data Capture (CDC) has become essential for moving data from transactional systems into lakehouse architectures. Apache Iceberg is increasingly adopted for this purpose, but teams often find it challenging to understand how Iceberg behaves in real CDC pipelines.
This session shares lessons learned from running CDC pipelines on Apache Iceberg at scale at Nexon. We focus on common production scenarios and how they were addressed: how CDC pipelines behave under burst traffic and where bottlenecks emerge, how update- and delete-heavy workloads require different handling than insert-only ones, and how design choices such as primary-key handling and partitioning can lead to data duplication and operational overhead.
This talk bridges Iceberg concepts with production realities through CDC use cases.
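To make the insert-only vs. update/delete contrast from the abstract concrete, here is a minimal, hypothetical sketch (not Nexon's actual pipeline, and independent of any Iceberg API): it applies a CDC event stream to a table keyed by primary key, and contrasts that with a naive append that treats every event as a new row, which is where duplicate keys come from.

```python
# Hedged illustration of CDC apply semantics. All names here are
# hypothetical; this models the logic, not Iceberg itself.

def apply_cdc(table: dict, events):
    """Merge CDC events into `table`, keyed by primary key (upsert/delete)."""
    for op, key, row in events:
        if op in ("insert", "update"):
            table[key] = row          # upsert: last writer wins per key
        elif op == "delete":
            table.pop(key, None)      # delete removes the existing row
    return table

def append_only(rows: list, events):
    """Naive insert-only handling: every non-delete event becomes a new row."""
    for op, key, row in events:
        if op != "delete":
            rows.append((key, row))   # updates land as duplicate keys
    return rows

events = [
    ("insert", 1, {"name": "a"}),
    ("update", 1, {"name": "a2"}),   # same key updated in place
    ("insert", 2, {"name": "b"}),
    ("delete", 2, None),
]

merged = apply_cdc({}, events)       # one row per key; key 2 is gone
appended = append_only([], events)   # key 1 appears twice; key 2 survives
```

With merge semantics the table ends up with a single row for key 1 and none for key 2; the append-only path keeps both versions of key 1 and never removes key 2. In an Iceberg table the analogous choice is between merge-on-write/merge-on-read with row-level deletes and plain appends, which is why key handling is central to avoiding duplication.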
Seungchul Lee
Nexon, Senior Data Engineer, Software Engineer, Streaming, Data Lakehouse