Operationalizing Agentic AI Safety & Evaluation for Multi-Agent Financial Systems

As financial AI shifts from passive models to autonomous agents, the industry faces a trust gap. Traditional "black box" validation is insufficient for systems executing complex workflows. This keynote explores the transition from MLOps to AgentOps, defining a standard for auditable Agentic AI where the decision process is as critical as the result.

We will dissect the FinSight Agent, a metacognitive FINOS Labs initiative built on LangGraph and MLflow, to demonstrate "Governance by Design." Aligning with the FINOS AI Evaluation Framework, we operationalize a "glass box" strategy. This moves beyond static benchmarks to implement trajectory tracing, where reasoning steps are audited against financial policies using LLM-as-a-Judge.

Finally, we cover system safety and supply chain security. We demonstrate how proactive Red Teaming detects risks like market manipulation and regulatory evasion. We also explore ensuring model integrity via the Model Openness Framework and Sigstore, proving open collaboration is key to building safe, compliant financial infrastructure.

Vincent Caldeira

Leading Open Source Technology Innovation for a Sustainable Future

Singapore

Actions

View Speaker Profile

Please note that Sessionize is not responsible for the accuracy or validity of the data provided by speakers. If you suspect this profile to be fake or spam, please let us know.

Session

Operationalizing Agentic AI Safety & Evaluation for Multi-Agent Financial Systems

Vincent Caldeira

Links

Actions