Session

AgentOps for Real: Observability, Evals, and Control for Production AI Agents

AI agents behave less like traditional software functions and more like probabilistic autonomous systems. They call tools, chain reasoning steps, and adapt their behavior based on model outputs, which means conventional testing and monitoring approaches break down quickly.

So how do engineering teams safely operate agents in production? How do you detect behavioral drift after model or prompt updates? Prevent silent regressions in tool-using agents? Debug complex multi-step executions? Monitor real-world behavior after deployment? And most importantly, how do you regain control when an agent starts going off the rails?

In this session, we’ll walk through a practical AgentOps lifecycle for production AI agents, including evaluation datasets, golden prompts, behavioral test suites, hybrid scoring methods, trace-driven debugging, regression testing for multi-step agents, CI/CD quality gates, and production monitoring and control patterns.

Using modern cloud tooling and .NET-based agent frameworks, we’ll explore repeatable techniques for testing, observing, and operating AI agents as real software systems, not magic. Attendees will leave with a practical framework they can apply to improve reliability, visibility, and control in their own agent-based solutions.
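To make the golden-prompt and regression-testing idea concrete, here is a minimal C# sketch of the kind of quality gate the session covers. The IAgent interface and GoldenCase record are hypothetical illustrations, not part of any specific framework; a real agent built with Semantic Kernel, Azure AI, or another .NET framework would sit behind the RunAsync call.

    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    // Hypothetical agent abstraction: any .NET agent framework could implement this.
    public interface IAgent
    {
        Task<string> RunAsync(string prompt);
    }

    // A golden prompt plus the markers its answer is expected to contain.
    public record GoldenCase(string Prompt, string[] MustContain);

    public static class GoldenPromptGate
    {
        // Replays every golden prompt against the agent and returns the cases
        // whose output no longer contains all expected markers; an empty list
        // means the build can pass this quality gate.
        public static async Task<IReadOnlyList<GoldenCase>> FindRegressionsAsync(
            IAgent agent, IEnumerable<GoldenCase> cases)
        {
            var failures = new List<GoldenCase>();
            foreach (var c in cases)
            {
                string output = await agent.RunAsync(c.Prompt);
                foreach (var marker in c.MustContain)
                {
                    if (!output.Contains(marker, StringComparison.OrdinalIgnoreCase))
                    {
                        failures.Add(c);
                        break;
                    }
                }
            }
            return failures;
        }
    }

In a CI pipeline, a check like this would run alongside richer hybrid scoring (for example, an LLM-as-judge pass) rather than replace it, since simple marker matching only catches the most obvious behavioral drift.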

Brian Haydin

Microsoft Cloud & AI Architect | Azure • GitHub • Power Platform | Demo-first sessions (US + Europe)

Milwaukee, Wisconsin, United States
