From RAG to Reliable Agents: An Open Source Playbook for Evaluation, Guardrails, and LLMOps
Teams are moving from Retrieval-Augmented Generation (RAG) to agentic workflows that plan, call tools, and take actions. The hard part is no longer making a demo work; it is making behavior reliable, safe, and observable in production.
This session presents a practical, open-source “Day 2” playbook for building trustworthy agents. We cover three pillars:
Offline evaluation: automated eval harnesses using heuristic metrics (groundedness/faithfulness, relevancy) plus agent-specific checks like tool-call correctness and step success rate, with regression gates before release.
Runtime guardrails: interceptors that mitigate prompt injection, block sensitive data leakage and unsafe outputs, and stop unauthorized tool actions via allowlists, policy checks, and redaction.
LLMOps and observability: tracing and structured telemetry to debug multi-turn tool execution, localize failures (retrieval vs. planning vs. tool execution), and monitor drift, latency, and cost.
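The first pillar, offline evaluation with regression gates, can be sketched in a few lines. This is a generic illustration under assumed names (EvalCase, tool_call_correctness, regression_gate are all hypothetical, not the API of Ragas or DeepEval): each test case records the tool calls the agent should have made and the ones it actually made, and a release gate fails when the mean correctness drops below a threshold.

```python
# Hypothetical sketch of an offline eval regression gate: score recorded
# agent traces for tool-call correctness and gate releases on a threshold.
# All names are illustrative, not from any specific eval library.
from dataclasses import dataclass


@dataclass
class EvalCase:
    question: str
    expected_tools: list[str]  # tools the agent should call, in order
    actual_tools: list[str]    # tools it actually called in the recorded trace


def tool_call_correctness(case: EvalCase) -> float:
    """Fraction of expected tool calls found, in order, in the actual trace."""
    remaining = iter(case.actual_tools)  # 'in' consumes the iterator, enforcing order
    hits = sum(1 for tool in case.expected_tools if tool in remaining)
    return hits / len(case.expected_tools) if case.expected_tools else 1.0


def regression_gate(cases: list[EvalCase], threshold: float = 0.9) -> bool:
    """Release gate: mean tool-call correctness must meet the threshold."""
    mean = sum(tool_call_correctness(c) for c in cases) / len(cases)
    return mean >= threshold


cases = [
    EvalCase("refund status?", ["search_orders", "get_refund"],
             ["search_orders", "get_refund"]),
    EvalCase("cancel my order", ["get_order", "cancel_order"],
             ["get_order"]),  # missing call: scores 0.5
]
print(regression_gate(cases))  # mean is 0.75 < 0.9, so the gate fails → False
```

In practice the same gate would aggregate several metrics (groundedness, relevancy, step success rate), but the structure is the same: deterministic scores over a fixed test set, compared against a threshold in CI before release.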
Attendees leave with a reference architecture, metric checklist, and implementation patterns using open-source components (e.g., Ragas/DeepEval for evals, guardrail libraries, Langfuse/OpenTelemetry-style tracing).
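As one concrete guardrail pattern from the second pillar, a tool-call interceptor can pair an allowlist with output redaction. The sketch below is a minimal, hand-rolled illustration (the tool names, the ALLOWED_TOOLS policy, and the naive email regex are all assumptions, not the API of any guardrail library):

```python
# Hypothetical runtime guardrail: allowlist tool calls and redact obvious
# sensitive data from model output before it reaches the user.
import re

ALLOWED_TOOLS = {"search_docs", "get_weather"}  # illustrative allowlist
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")  # naive PII pattern, demo only


class ToolNotAllowed(Exception):
    """Raised when the agent attempts a tool outside the allowlist."""


def guard_tool_call(tool_name: str, args: dict) -> None:
    """Policy check: block any tool the agent is not explicitly allowed to use."""
    if tool_name not in ALLOWED_TOOLS:
        raise ToolNotAllowed(f"blocked tool call: {tool_name}")


def redact(text: str) -> str:
    """Replace email addresses with a placeholder before returning output."""
    return EMAIL.sub("[REDACTED]", text)


guard_tool_call("search_docs", {"query": "refund policy"})  # passes silently
print(redact("Contact alice@example.com"))  # → Contact [REDACTED]
```

Production guardrail libraries layer richer policies (semantic injection detection, schema validation, human-in-the-loop approval), but they follow this interceptor shape: a check before every tool call and a filter on every output.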
Puspanjali Sarma
Engineering Leader | Principal Architect | Published Author | Thought Leader | Mentor | Speaker | 40under40 Data Scientist | ML | AI | Data Engineering | Generative AI | Agents & Agentic AI
Hyderabad, India