Yash Mayya

Software Engineer at StarTree | Committer on Apache Kafka, Apache Pinot

Yash is currently a Software Engineer at StarTree and an Apache Pinot committer working on building the best possible real-time analytics database. Prior to this, he worked on the Kafka Connect ecosystem at Confluent and is also a committer on the open-source Apache Kafka project. Yash is an open-source enthusiast with a passion for distributed systems and data infrastructure.

Connecting offsets, fault tolerance, and delivery semantics in Kafka Connect

Offsets are ubiquitous in streaming data systems. The simplest analogy for offsets is a video game, where checkpoints let you avoid respawning from the beginning if your main character (Kafka Consumer, Connector, Streams application, etc.) dies at any point. However, as is always the case with distributed systems, there is a lot more going on under the hood.

During this session, we will explore how offsets are represented, stored, and used for source and sink Kafka connectors. We will also discuss how fault tolerance is achieved in Kafka Connect. This will be followed by a deep dive into delivery semantics in Kafka Connect and why they’re intrinsically linked with offsets, including how and when we can achieve the holy grail of exactly-once delivery semantics.

Audience members can expect to learn about the nitty-gritty details of a distributed system like Kafka Connect and the patterns that allow us to accomplish complex goals like fault tolerance and exactly-once delivery semantics, both of which are essential to building robust data pipelines.

Mastering Multi-Stage Query Performance in Apache Pinot

Apache Pinot has become a cornerstone for real-time analytics, enabling organizations to deliver low-latency insights at scale. At the forefront of this evolution recently has been Pinot’s multi-stage query engine, a transformative innovation that unlocks new possibilities for advanced analytics use cases. This session explores the journey of Pinot’s multi-stage query engine, tracing its development, key milestones, and future roadmap.

First introduced in version 1.0 primarily to enable query-time joins, the engine has since evolved to support a wide array of features such as window functions, funnel analytics, enhanced debuggability with query stats and comprehensive explain plans, and numerous performance optimizations, while continuing to progress steadily toward full standard SQL semantics.

With Pinot 1.2.0, you can now use advanced query statistics to diagnose bottlenecks. With Pinot 1.3.0, a new multi-stage query explain plan reveals how queries are executed. Finally, you'll discover how to use the reuse CTE feature recently introduced into Pinot to streamline complex queries and optimize resource utilization.

Learn practical strategies for performance optimization, including how the different join strategies work and how to choose the most efficient one for your use case.
