Testing Agents Before They Test You
Would you let a stranger handle your customer data?
Would you let a new hire talk to a client on their first day?
Would you put your kid in a self-driving car and just say "Have fun at school"?
Then why do we trust our shiny new AI agents to behave correctly in production without testing them?
In this talk, we share our journey of evaluating agentic systems before and after deployment. We'll walk through how to move from "it works in the demo" to trustworthy, observable systems that you can confidently run in production.
We'll show practical examples of building evaluation pipelines and how we experiment with simple, measurable ways to understand an agent's behavior over time. We'll share what we've learned so far: where things go wrong, what helps, and what remains an open challenge as we build toward more mature evaluation practices.
Expect real experiences, not just theory: live examples and ideas you can take home to build trust into your own agents.

Jettro Coenradie
Fellow at Luminis working as Search and GenAI expert
Pijnacker, The Netherlands