Session
Why Your SQL is Slow: Unmasking Hidden Performance Traps in Apache Pinot
As developers, we've all been there: you write a SQL query that looks perfectly clean and correct, but it runs unexpectedly slow. You check the execution plan and find the database has chosen a brute-force, inefficient path. This is a problem in single-node databases like Postgres, but it is even worse in distributed ones.
Apache Pinot is a distributed OLAP database that can run complex SQL expressions (including window functions and joins) in milliseconds across billions of rows. However, its optimizer applies complex relational algebra transformations while remaining blind to business invariants known only to humans (and sometimes not even them).
This means that a seemingly harmless function call, an innocent offset or limit, the subtle properties of NULL handling, or a slight change in a JOIN condition can unknowingly prevent these transformations, forcing the engine to abandon powerful optimizations. To make matters worse, ORMs like Hibernate are notorious for generating queries that fall into these exact traps.
While this talk uses Apache Pinot for its examples, the concepts are universal. Whether you're using ClickHouse, traditional single-node SQL databases, or even NoSQL databases like MongoDB, the principles for writing optimizer-friendly queries remain the same. Using concrete, real-world examples, I will demonstrate how seemingly equivalent SQL queries can produce wildly different execution plans. You'll leave this talk armed with the knowledge to spot these "optimization killers" and craft queries that work with the database, not against it, ensuring your analytics are always blazingly fast.
This talk tackles a universal developer pain point: writing a "perfect" SQL query that runs inexplicably slowly. Using the Apache Pinot distributed database as a practical case study, I will demystify why this happens.
The session dives into how developers unknowingly sabotage their own queries through subtle NULL semantics, innocent-looking functions, and ORM-generated code that breaks the relational algebra rules core to any modern optimizer.
This is a highly practical, "code-and-consequences" session that will give attendees a new mental model for writing performant SQL that applies not just to Pinot, but to any database they use.
Gonzalo Ortiz Jaureguizar
Performance engineer at Startree
Madrid, Spain
Links
Please note that Sessionize is not responsible for the accuracy or validity of the data provided by speakers. If you suspect this profile to be fake or spam, please let us know.
Jump to top