Designing a Cloud-Edge Data Backbone for Physical AI Systems

Physical AI systems in robotics, industrial automation, and other real-world environments rely on continuous telemetry from sensors, machines, and human-in-the-loop actions. Unlike the data handled by cloud-native software systems, these signals represent irreversible real-world events that cannot be reconstructed if they are not captured when they occur. However, many production data pipelines still assume that data can be recomputed or backfilled, which leads to irreproducible training datasets and blind spots in debugging and drift analysis.

This session presents a production cloud-edge data architecture that treats telemetry as immutable historical truth and unifies raw sensor data, inference metadata, and operational events into a time-aligned, append-only record. The architecture separates capture correctness from downstream compute, preserves late-arriving data, and enables reproducible reconstruction of training datasets. Practical workflows for model drift debugging and historical dataset reproduction are discussed, along with trade-offs in storage cost and operational complexity.
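The idea of a time-aligned, append-only record that preserves late-arriving data can be illustrated with a minimal sketch. It is not the speaker's implementation; the names `Record`, `AppendOnlyLog`, and `snapshot`, and the use of two timestamps (`event_time` for when something happened, `ingest_time` for when it was captured), are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """One immutable telemetry entry (hypothetical schema)."""
    source: str         # e.g. a sensor, model, or operator identifier
    kind: str           # "sensor", "inference", or "operation"
    event_time: float   # when the event happened in the physical world
    ingest_time: float  # when the record reached the log
    payload: dict


class AppendOnlyLog:
    """Records are only ever appended, never updated or deleted."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, record: Record) -> None:
        # Late data is appended like any other record; nothing is rewritten.
        self._records.append(record)

    def snapshot(self, start: float, end: float, as_of: float) -> List[Record]:
        """Return records with start <= event_time < end that had been
        ingested by `as_of`, ordered by event time (time-aligned view)."""
        rows = [
            r for r in self._records
            if start <= r.event_time < end and r.ingest_time <= as_of
        ]
        return sorted(rows, key=lambda r: (r.event_time, r.ingest_time))


log = AppendOnlyLog()
log.append(Record("cam-1", "sensor", 10.0, 10.5, {"lux": 300}))
log.append(Record("model-a", "inference", 10.2, 10.6, {"label": "ripe"}))
# A record that arrives long after the event it describes.
log.append(Record("cam-2", "sensor", 9.0, 50.0, {"lux": 280}))

# The dataset as it looked at ingest time 20, before the late record arrived.
v1 = log.snapshot(0.0, 20.0, as_of=20.0)
# The same event-time window, now including the late arrival.
v2 = log.snapshot(0.0, 20.0, as_of=60.0)
```

Because nothing is overwritten, calling `snapshot` again with the same `as_of` value returns the same records, which is one way a historical training dataset could be reproduced. The late record shows up only in views taken after it was ingested.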

An Phan

Senior Data Infrastructure Engineer @ Hippo Harvest

San Jose, California, United States

