Session

From PyTorch to Production: Serving a Physics-Constrained Generative Model with ONNX, Ray, and vLLM

Generative downscaling turns coarse climate fields into kilometer-scale maps, but diffusion and flow-matching models routinely break the physics they should respect: winds that diverge, negative rainfall, precipitation totals that drift from the coarse input. PC-RF is a conditional rectified-flow model, written in PyTorch, that fuses ERA5 atmospheric reanalysis fields with Sentinel-2 imagery and builds three physical constraints (divergence-free wind, non-negative precipitation, domain mass balance) directly into the training objective.
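As a rough illustration of what the three constraints could look like as penalty terms, here is a minimal sketch. It is not the PC-RF implementation: the function name, the `[y, x]` grid layout, the finite-difference divergence, and the squared-error form of each term are all assumptions, and NumPy stands in for PyTorch to keep the example dependency-light.

```python
import numpy as np


def physics_penalties(u, v, precip, coarse_total, dx=1.0, dy=1.0):
    """Return (divergence, negativity, mass) penalties for one sample.

    Hypothetical helper, not taken from the talk.
    u, v         : 2-D wind components on a regular grid, indexed [y, x]
    precip       : 2-D fine-scale precipitation field
    coarse_total : domain precipitation total implied by the coarse input
    """
    # Finite-difference divergence du/dx + dv/dy; zero for divergence-free wind.
    div = np.gradient(u, dx, axis=1) + np.gradient(v, dy, axis=0)
    divergence = float(np.mean(div ** 2))

    # Only negative precipitation values contribute to this term.
    negativity = float(np.mean(np.minimum(precip, 0.0) ** 2))

    # Squared mismatch between the fine-scale total and the coarse total.
    mass = float((precip.sum() - coarse_total) ** 2)

    return divergence, negativity, mass
```

In training, terms like these would typically be added to the flow-matching loss with tunable weights; the weighting scheme is not specified in the abstract. The same three quantities can also double as evaluation metrics on generated samples.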

This talk follows one model from PyTorch training to a served stack, and the engineering that broke along the way: exporting a U-Net with attention two ways (torch.onnx and AOTInductor), and why both specialized on input shapes until coaxed; running the rectified-flow ODE sampler outside the graph; enforcing physics at inference so non-negativity and mass-conservation errors hit exactly zero; and a Ray Train recipe for scaling. The serving path is polyglot: ONNX Runtime in Rust, Temporal in Go, and a vLLM agent that calls the downscaler as a tool. You will leave with patterns for shipping scientific generative models, plus three physics-validity metrics worth adding to your eval.
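Two of the ideas above can be sketched in a few lines: a sampler loop that lives outside the exported graph, and a post-hoc projection that removes negative precipitation and restores the domain total. This is a sketch under stated assumptions, not the talk's code. The fixed-step Euler integrator, the clip-then-rescale projection, and the names `euler_sample`, `project_precip`, and `velocity` are hypothetical; in a served stack `velocity` would wrap a call into the exported network.

```python
import numpy as np


def euler_sample(velocity, x0, steps=8):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler.

    `velocity` is any callable (x, t) -> array; the loop itself stays in
    host code, so only the single-step network needs to be exported.
    """
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity(x, i * dt)
    return x


def project_precip(precip, coarse_total):
    """Clip negatives, then rescale so the domain total matches coarse_total.

    Assumes coarse_total >= 0. This is one simple projection; others exist.
    """
    clipped = np.maximum(precip, 0.0)
    total = clipped.sum()
    if total <= 0.0:
        # Degenerate case: nothing positive to rescale, so spread uniformly.
        return np.full_like(clipped, coarse_total / clipped.size)
    return clipped * (coarse_total / total)


# Toy usage: a constant velocity field stands in for the trained model.
target = np.array([[2.0, -1.0], [3.0, 0.0]])
sample = euler_sample(lambda x, t: target, np.zeros((2, 2)))
fixed = project_precip(sample, coarse_total=10.0)
```

After projection, no value in `fixed` is negative and its sum equals the coarse total up to floating-point rounding. Multiplicative rescaling preserves zeros and the spatial pattern of the positive values, which is why it is used here instead of an additive correction.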

Arun Sharma

PhD Researcher, Spatial AI and Physics-Informed ML, University of Minnesota

Palo Alto, California, United States
