Session
Observability Is Not Enough: Reliability Engineering Using Digital Twins for AI-Native Systems
Current SRE practices were designed for human-scale systems, where changes were infrequent and engineers could reason about failures after they occurred. That world no longer exists. AI coding assistants and automation pipelines now generate code and infrastructure changes at machine speed, while our reliability tooling remains fundamentally reactive. Observability tells us what happened, but it cannot tell us why it happened or what will happen next.
In this talk, I will explain why observability alone is insufficient for AI-native systems. Drawing on causal reasoning and Judea Pearl’s Ladder of Causation, I will present digital twins as executable world models for reliability engineering. These twins integrate telemetry, infrastructure state, configuration, and policy into a causal graph that can be simulated before changes reach production. By moving reliability validation from post-incident analysis to pre-deployment simulation, SRE teams can predict blast radius, validate safety, and reason counterfactually about failures.
The future of SRE is not better dashboards, but rather simulation-driven, autonomous reliability.
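As a rough illustration of what predicting blast radius on a graph can look like, the sketch below walks a small service dependency graph and lists everything that transitively depends on a changed component. It is not taken from the talk: the service names, the DEPENDS_ON map, and the blast_radius function are all hypothetical, and plain reachability is a much cruder model than the causal graph the abstract describes.

```python
from collections import deque

# Hypothetical topology (an assumption for illustration only):
# each key depends on the services in its list.
DEPENDS_ON = {
    "checkout": ["payments", "cart"],
    "payments": ["postgres"],
    "cart": ["redis"],
    "search": ["elasticsearch"],
}


def blast_radius(changed, depends_on):
    """Return every service that transitively depends on `changed`."""
    # Invert the edges so we can walk from a component to its dependents.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first search over the inverted graph.
    affected = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for parent in dependents.get(node, ()):
            if parent != changed and parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


if __name__ == "__main__":
    # Simulate a change to the database before it ships.
    print(sorted(blast_radius("postgres", DEPENDS_ON)))
    # → ['checkout', 'payments']
```

A real twin would presumably attach telemetry, configuration, and policy to the nodes and edges and simulate interventions rather than only follow dependencies, but the pre-deployment question is the same: what does this change touch?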
Priya Ranjan Sahoo
Principal Engineer, Oracle America Inc.
San Francisco, California, United States