Chaitanya Deepthi Chadalavada
StarTree, Senior Software Engineer I
Deepthi is a Senior Software Engineer at StarTree and an active contributor to Apache Pinot. She has worked extensively on Upserts and Deduplication in Pinot, addressing real-time data challenges at scale.
Have Your Real-time OLAP and Upsert It Too
Upserts (insert-or-update) are fundamental to OLTP systems, whose row-based, mutable data formats are built to support them. However, implementing upserts in OLAP systems presents unique challenges because their columnar, immutable data design is optimized for analytical query workloads. This becomes even more complex in real-time OLAP systems, where freshness of seconds, ingestion rates of millions of events per second, and query throughput in the thousands of queries per second (QPS) are the norm.
In today’s dynamic data environments, many use cases require analytics on constantly changing, upserted views. Examples include analytics on trip data for a ride-sharing app to make decisions about surge pricing, rerouting, or promotions, where the ride event keeps evolving (start, reroute, stop, tips, tolls), and customer profiles that are enriched over time by new events. If the OLAP database cannot support upserts in real time, it either pushes computation to the application or relies on asynchronous periodic refreshes. Both approaches compromise latency, flexibility, and data freshness. Most OLAP systems today that claim to support upserts implement only the latter: asynchronous periodic refresh.
Apache Pinot has native upsert support. Come to this talk to hear the story of how we built upserts for OLAP and scaled them to support billions of primary keys per node, while maintaining ingestion freshness SLAs of seconds and ingesting hundreds of thousands of events per second.
We’ll dive into key questions such as:
– How do we efficiently manage billions of primary keys on a single node under constrained resources?
– How do we ensure fast recovery through restarts to maintain operational SLAs?
– How do we optimize costs as the table scales significantly?