Session
Spark Performance Engineering: Tuning, Optimization, Debugging & Beyond
Take your Spark expertise to the next level with a systematic approach to performance engineering that transforms how you build, tune, and debug production workloads. This intensive workshop moves beyond the basics to help attendees develop expert-level skills in performance optimization and in troubleshooting complex production issues. It covers the following topics through hands-on labs built around mock streaming data for Lego manufacturing and sales.
- Execution Architecture: Understanding Spark's query planning, Fabric's Native Execution Engine, Delta Lake internals, and distributed execution patterns that inform all tuning decisions.
- Performance Diagnostics: Reading Spark UI like an expert, interpreting metrics and logs, identifying bottlenecks, and establishing performance baselines.
- Systematic Tuning Methodology: A hierarchy-based approach from table features and physical design through session configurations.
- Optimization Patterns: DataFrame transformations, join strategies, caching use cases, resource allocation, Adaptive Query Execution, and streaming optimizations.
- Advanced Debugging: Diagnosing OOMs, data skew, spill issues, and storage problems with proven troubleshooting tips, tricks, and best practices.
Platform Context: While the workshop focuses on Spark in Microsoft Fabric, the core concepts apply across Spark platforms.
Prerequisites: Spark fundamentals including DataFrames/SQL, basic understanding of distributed systems, and experience building data pipelines. Attendees must have existing experience using Spark.
Outcome: A systematic toolkit for optimizing any Spark workload, debugging production issues efficiently, and designing high-performance data solutions that scale.
Miles Cole
Spark and Lakehouse Evangelist | Principal Program Manager @ Microsoft, Fabric CAT
Littleton, Colorado, United States