Observability Is Not Enough: Reliability Engineering Using Digital Twins for AI-Native Systems

Current SRE practices were designed for human-scale systems, where changes were infrequent and engineers could reason about failures after they occurred. That world no longer exists. AI coding assistants and automation pipelines now generate code and infrastructure changes at machine speed, while our reliability tooling remains fundamentally reactive. Observability tells us what happened, but it cannot tell us why it happened or what will happen next.

In this talk, I will explain why observability alone is insufficient for AI-native systems. Drawing on causal reasoning and Judea Pearl’s Ladder of Causation, I will introduce digital twins as executable world models for reliability engineering. These twins integrate telemetry, infrastructure state, configuration, and policy into a causal graph that can be simulated before changes reach production. By moving reliability validation from post-incident analysis to pre-deployment simulation, SRE teams can predict blast radius, validate safety, and reason counterfactually about failures.
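
As a rough illustration of the blast-radius idea, the sketch below treats the twin as a plain directed graph and asks which components sit downstream of a proposed change. It is not the speaker's implementation: the component names, the `CAUSAL_GRAPH` structure, and the `blast_radius` and `is_safe` helpers are all hypothetical, and a real twin would also carry telemetry, configuration, and policy on the nodes and edges.

```python
from collections import deque

# Hypothetical causal graph: each key can affect the components it lists.
CAUSAL_GRAPH = {
    "db-config": ["orders-db"],
    "orders-db": ["orders-api"],
    "orders-api": ["checkout-ui", "billing-worker"],
    "checkout-ui": [],
    "billing-worker": [],
    "search-api": [],
}


def blast_radius(graph, changed):
    """Return every component reachable downstream of the changed one."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def is_safe(graph, changed, protected):
    """A change passes this toy check if it cannot reach a protected component."""
    return blast_radius(graph, changed).isdisjoint(protected)


if __name__ == "__main__":
    # Simulate a change before it ships and list what it could affect.
    print(sorted(blast_radius(CAUSAL_GRAPH, "db-config")))
    print(is_safe(CAUSAL_GRAPH, "search-api", {"checkout-ui"}))
```

Reachability only answers "what could this touch"; the counterfactual questions in the abstract would need a model of how each edge propagates a failure, which this sketch leaves out.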

The future of SRE is not better dashboards, but rather simulation-driven, autonomous reliability.

Priya Ranjan Sahoo

Principal Engineer, Oracle America Inc.

San Francisco, California, United States
