Simulated Worlds for Agent Engineering: Planning, Policy, and Evaluation

As AI agents move beyond single-prompt demos into multi-step, autonomous workflows, teams face a core challenge: how do we design, test, and govern agent behavior before deploying it into real systems?

This talk presents simulated worlds as a practical engineering tool for building and evaluating agent systems. Using a controlled, Doom-like simulation environment, we explore how agents can plan actions, coordinate with other agents, operate under explicit policy constraints, and be evaluated over long-running interactions.

Rather than focusing on game mechanics, we treat the simulation as a safe, repeatable laboratory for agent engineering. We’ll examine how structured environments surface failure modes early, make agent behavior observable, and enable evaluation that goes beyond simple task success.

Topics include:

- Designing agent planning and decision loops in constrained environments
- Enforcing policy and safety boundaries at runtime
- Multi-agent coordination and conflict handling
- Behavioral and lifecycle-based evaluation metrics for agents
- Lessons learned translating simulation insights to real workflows

This session is framework-agnostic and aimed at developers and architects who want reliable, testable, and governable agent systems, not just clever prompts.

Alexander Chernov

🤖 Link-Think-Act · Associate Principal Data Engineer @ AstraZeneca · M.Sc. Physics · M.Sc. Information and Communications Engineering

Toronto, Canada
