Session
Trust, But Verify: Responsible AI and Evaluations in Microsoft Foundry
Shipping an AI feature is easy. Proving that it works, behaves safely, and stays that way over time is much harder.
Most teams treat evaluation as a one-time step before release. In practice, production AI systems require continuous measurement and enforcement. Microsoft Foundry provides the tools to do this properly, including safety evaluators, quality metrics such as groundedness and coherence, agent-level evaluation, and automated red teaming.
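For a flavor of what the session covers, here is a minimal sketch of running two of those built-in quality evaluators from the azure-ai-evaluation Python package. The endpoint, key, and deployment values are placeholders you would supply from your own Azure AI Foundry project, and the sample query/context/response triple is invented for illustration.

# Minimal sketch: scoring a single response with the built-in
# groundedness and coherence evaluators from azure-ai-evaluation.
# The model_config values below are placeholders, not real credentials.
from azure.ai.evaluation import GroundednessEvaluator, CoherenceEvaluator

model_config = {
    "azure_endpoint": "https://<your-resource>.openai.azure.com",
    "api_key": "<your-api-key>",
    "azure_deployment": "<your-judge-model-deployment>",
}

groundedness = GroundednessEvaluator(model_config)
coherence = CoherenceEvaluator(model_config)

query = "What is the return window for online orders?"
context = "Online orders can be returned within 30 days of delivery."
response = "You can return online orders within 30 days of delivery."

# Each evaluator returns a dict containing a 1-5 score and a reason string.
print(groundedness(query=query, response=response, context=context))
print(coherence(query=query, response=response))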
In this session, we’ll build a practical evaluation loop using the Azure AI Evaluation SDK and integrate it into a CI/CD pipeline. You’ll see how to detect regressions, measure real-world behavior, and treat responsible AI as an ongoing engineering discipline rather than a compliance exercise.
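And a sketch of the kind of CI gate the session builds toward: run a batch evaluation over a JSONL dataset with the SDK's evaluate() function, then fail the build if the aggregate score regresses below a threshold. The dataset path, the threshold, and the exact aggregate metric key are assumptions for illustration; inspect result["metrics"] from your own run to confirm the key names.

# Sketch of a CI gate: batch-evaluate a dataset and fail the build
# if aggregate groundedness drops below a threshold. The dataset path,
# threshold, and metric key are illustrative assumptions.
import sys
from azure.ai.evaluation import evaluate, GroundednessEvaluator

model_config = {
    "azure_endpoint": "https://<your-resource>.openai.azure.com",
    "api_key": "<your-api-key>",
    "azure_deployment": "<your-judge-model-deployment>",
}

result = evaluate(
    data="eval_dataset.jsonl",  # one {"query", "context", "response"} object per line
    evaluators={"groundedness": GroundednessEvaluator(model_config)},
)

# Aggregate metrics are keyed "<evaluator_name>.<metric>"; the exact
# key shown here is an assumption, so verify it against a local run.
score = result["metrics"].get("groundedness.groundedness", 0)
if score < 4.0:
    print(f"Groundedness regression: {score} < 4.0")
    sys.exit(1)  # non-zero exit fails the CI job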
You’ll leave with a clear approach to implementing quality and safety evaluations, an understanding of the difference between pre-deployment and continuous evaluation, and a way to connect these practices to your delivery pipeline.
Roelant Dieben
Cloud architect @ Sopra Steria
Lopik, The Netherlands