Session
Beyond the Basics: Advanced SCD2 Implementation for Compound Dimension Tables
Slowly Changing Dimensions (SCD) are well known, and numerous blogs and tutorials cover how to implement them in T-SQL, PySpark, or low-code tools such as Dataflows and Pipelines. However, these examples typically focus on a single source table, such as a customer or product table. In my project, I faced a more complex scenario: my customer data was derived from multiple source tables, combining details such as customer names, addresses, statuses, partners, and more. Each source was updated at a different frequency, and tracking historical changes in customer data was crucial, because personalized discounts, offers, and sales strategies depended on status and partner relationships.
This session demonstrates how I built a compound customer table with SCD Type 2 logic without resorting to a full data reload on every run. Instead, I designed a delta load mechanism that processes only the changed data. Using Microsoft Fabric and Notebooks, I solved the challenge of efficiently managing and updating this complex dataset. While the solution is showcased in Microsoft Fabric, the techniques can be applied in other environments. Join this session to learn how to handle multi-source dimensions with a practical, scalable approach that minimizes reprocessing and enhances data accuracy.
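To make the idea concrete, here is a minimal sketch of a delta-driven SCD Type 2 update in plain Python. It is not the implementation shown in the session, which uses Microsoft Fabric Notebooks. The function name `apply_delta`, the row layout (`attrs`, `valid_from`, `valid_to`, `is_current`), and the `OPEN_END` marker are all assumptions made for illustration. The delta carries only the attributes that changed in one of the sources, and only the affected customers get a new version row.

```python
from datetime import date

# Hypothetical "still valid" end date for the current version of a row.
OPEN_END = date(9999, 12, 31)


def apply_delta(dimension, delta, load_date):
    """Apply partial attribute changes to an SCD2 dimension (a list of dicts).

    `delta` maps customer_id -> dict holding only the attributes that changed
    in some source (name, address, status, partner, ...).
    Customers that do not appear in `delta` are never touched.
    """
    # Index the current version of each customer for quick lookup.
    current = {row["customer_id"]: row for row in dimension if row["is_current"]}

    for customer_id, changed_attrs in delta.items():
        old = current.get(customer_id)
        if old is None:
            # New customer: the first version starts at load_date.
            attrs = dict(changed_attrs)
        else:
            # Merge the partial change into the full current attribute set.
            attrs = {**old["attrs"], **changed_attrs}
            if attrs == old["attrs"]:
                continue  # nothing actually changed, keep the current version
            # Close the previous version.
            old["valid_to"] = load_date
            old["is_current"] = False

        dimension.append({
            "customer_id": customer_id,
            "attrs": attrs,
            "valid_from": load_date,
            "valid_to": OPEN_END,
            "is_current": True,
        })
    return dimension


# Example: an initial load, then a status-only change from a second source.
dim = apply_delta([], {1: {"name": "Ada", "status": "silver"}}, date(2024, 1, 1))
dim = apply_delta(dim, {1: {"status": "gold"}}, date(2024, 2, 1))
```

After the second call the dimension holds two versions of customer 1: the closed "silver" row and a current row that combines the unchanged name with the new status. In a Spark or SQL environment the same pattern would typically be expressed as a merge against the current rows rather than a Python loop.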