Session
Fast & Safe: Implementing Guardrails (and Caching Them) for Agentic LLM Apps
Enterprise LLM apps must screen inputs before they ever hit a model. This talk is a practitioner's blueprint for incoming guardrails: malicious-intent/hacking detection, toxicity filtering, out-of-scope routing, medical-emergency/self-harm risk detection with escalation, language detection, and PII redaction.
We'll show where each check sits in the request path, how to tune thresholds to avoid false blocks, and how to cache guardrail outcomes to reduce latency and save LLM tokens. We'll also cover fail-open vs. fail-closed strategies, audit logging, and metrics, with examples from real-world agentic applications.
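As a concrete illustration (not taken from the talk itself), a layered incoming-guardrail pipeline with per-check fail-open/fail-closed behavior might look like the Python sketch below; the `Guardrail` and `GuardrailResult` types, the check ordering, and the `fail_closed` flag are illustrative assumptions.

```python
# Illustrative sketch of a layered incoming-guardrail pipeline.
# All names are hypothetical; the talk's actual implementation may differ.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GuardrailResult:
    name: str                      # which check produced this verdict
    blocked: bool                  # True if the input should not reach the model
    reason: Optional[str] = None   # human-readable explanation for audit logs


@dataclass
class Guardrail:
    name: str
    check: Callable[[str], GuardrailResult]  # runs against the raw user input
    fail_closed: bool = True                 # on error: block (fail-closed) or skip (fail-open)


def run_incoming_guardrails(user_input: str,
                            guardrails: list[Guardrail]) -> Optional[GuardrailResult]:
    """Run checks in order; return the first blocking verdict, or None if the input is clean."""
    for rail in guardrails:
        try:
            result = rail.check(user_input)
        except Exception as exc:
            if rail.fail_closed:
                return GuardrailResult(rail.name, True,
                                       f"check errored ({exc}); failing closed")
            continue  # fail-open: skip this check and move on
        if result.blocked:
            return result
    return None
```

In a sketch like this, cheap deterministic checks (language detection, regex-based PII) would typically run before model-based classifiers, so most blocked requests never incur classifier latency or LLM tokens.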
Learning Objectives
* Implement a layered incoming guardrail pipeline: malicious/hacking, toxicity, out-of-scope, emergency/self-harm, language, PII.
* Design escalation paths (human-in-the-loop) for medical/self-harm and clean out-of-scope handoffs.
* Cache guardrail decisions safely: key design, cohorts, similarity thresholds, TTLs, version-aware invalidation (see the sketch after this list).
* Choose fail-open vs. fail-closed behaviors; place checks to minimize added latency and token usage.
* Instrument audits and dashboards (hit/miss, P95/P99, token savings) to prove safety and performance.
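To illustrate the caching objective above, here is a hypothetical Python sketch of version-aware cache keys with TTL expiry. The `GUARDRAIL_VERSION` constant, the cohort field, and the normalization step are assumptions, and similarity-threshold (semantic) matching is omitted for brevity.

```python
# Hypothetical sketch of version-aware caching for guardrail verdicts.
# Key design, TTL, and invalidation rules here are illustrative assumptions.
import hashlib
import time
from typing import Optional

GUARDRAIL_VERSION = "v7"       # bump to invalidate all cached verdicts after a policy change
DEFAULT_TTL_SECONDS = 3600

# In-memory stand-in for a shared cache: key -> (expiry timestamp, verdict)
_cache: dict[str, tuple[float, dict]] = {}


def cache_key(user_input: str, check_name: str, cohort: str) -> str:
    """Normalize the input and bind the key to the check, cohort, and guardrail version."""
    normalized = " ".join(user_input.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"{GUARDRAIL_VERSION}:{check_name}:{cohort}:{digest}"


def get_cached_verdict(key: str) -> Optional[dict]:
    """Return a cached verdict, treating expired entries as misses."""
    entry = _cache.get(key)
    if entry is None:
        return None
    expires_at, verdict = entry
    if time.time() > expires_at:
        _cache.pop(key, None)
        return None
    return verdict


def put_verdict(key: str, verdict: dict, ttl: int = DEFAULT_TTL_SECONDS) -> None:
    """Store a guardrail verdict with a TTL."""
    _cache[key] = (time.time() + ttl, verdict)
```

Because the guardrail version is part of the key, tightening a threshold or swapping a classifier only requires bumping `GUARDRAIL_VERSION`; stale verdicts simply stop matching and age out via TTL.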
Tags:
LLM, Guardrails, Agentic, Agents, Caching, Token Optimization, Architecture, Observability, LangGraph

Eyal Wirsansky
Senior Data Scientist, Artificial Intelligence Mentor
Jacksonville, Florida, United States