Session
Fast & Safe: Implementing Guardrails (and Caching Them) for Agentic LLM Apps
Enterprise LLM apps must screen inputs before they ever hit a model. This talk is a practitioner's blueprint for incoming guardrails: malicious-intent/hacking detection, toxicity filtering, out-of-scope routing, medical-emergency/self-harm risk detection with escalation, language detection, and PII redaction.
We'll show where each check sits in the request path, how to tune thresholds to avoid false blocks, and how to cache guardrail outcomes to reduce latency and save LLM tokens. We'll also cover fail-open vs. fail-closed strategies, audit logging, and metrics, with examples from real-world agentic applications.
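As a concrete illustration (not taken from the talk itself), a layered incoming-guardrail pipeline with per-check fail-open/fail-closed behavior might look like the Python sketch below; the `Guardrail` and `GuardrailResult` types, the check ordering, and the `fail_closed` flag are illustrative assumptions.

```python
# Illustrative sketch of a layered incoming-guardrail pipeline.
# All names are hypothetical; the talk's actual implementation may differ.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GuardrailResult:
    name: str                      # which check produced this verdict
    blocked: bool                  # True if the input should not reach the model
    reason: Optional[str] = None   # human-readable explanation for audit logs


@dataclass
class Guardrail:
    name: str
    check: Callable[[str], GuardrailResult]  # runs against the raw user input
    fail_closed: bool = True                 # on error: block (fail-closed) or skip (fail-open)


def run_incoming_guardrails(user_input: str,
                            guardrails: list[Guardrail]) -> Optional[GuardrailResult]:
    """Run checks in order; return the first blocking verdict, or None if the input is clean."""
    for rail in guardrails:
        try:
            result = rail.check(user_input)
        except Exception as exc:
            if rail.fail_closed:
                return GuardrailResult(rail.name, True,
                                       f"check errored ({exc}); failing closed")
            continue  # fail-open: skip this check and move on
        if result.blocked:
            return result
    return None
```

In a sketch like this, cheap deterministic checks (language detection, regex-based PII) would typically run before model-based classifiers, so most blocked requests never incur classifier latency or LLM tokens.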
Learning Objectives
* Implement a layered incoming guardrail pipeline: malicious/hacking, toxicity, out-of-scope, emergency/self-harm, language, PII.
* Design escalation paths (human-in-the-loop) for medical/self-harm and clean out-of-scope handoffs.
* Cache guardrail decisions safely: key design, cohorts, similarity thresholds, TTLs, version-aware invalidation (see the sketch after this list).
* Choose fail-open vs. fail-closed behaviors; place checks to minimize added latency and token usage.
* Instrument audits and dashboards (hit/miss, P95/P99, token savings) to prove safety and performance.
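To illustrate the caching objective above, here is a hypothetical Python sketch of version-aware cache keys with TTL expiry. The `GUARDRAIL_VERSION` constant, the cohort field, and the normalization step are assumptions, and similarity-threshold (semantic) matching is omitted for brevity.

```python
# Hypothetical sketch of version-aware caching for guardrail verdicts.
# Key design, TTL, and invalidation rules here are illustrative assumptions.
import hashlib
import time
from typing import Optional

GUARDRAIL_VERSION = "v7"       # bump to invalidate all cached verdicts after a policy change
DEFAULT_TTL_SECONDS = 3600

# In-memory stand-in for a shared cache: key -> (expiry timestamp, verdict)
_cache: dict[str, tuple[float, dict]] = {}


def cache_key(user_input: str, check_name: str, cohort: str) -> str:
    """Normalize the input and bind the key to the check, cohort, and guardrail version."""
    normalized = " ".join(user_input.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"{GUARDRAIL_VERSION}:{check_name}:{cohort}:{digest}"


def get_cached_verdict(key: str) -> Optional[dict]:
    """Return a cached verdict, treating expired entries as misses."""
    entry = _cache.get(key)
    if entry is None:
        return None
    expires_at, verdict = entry
    if time.time() > expires_at:
        _cache.pop(key, None)
        return None
    return verdict


def put_verdict(key: str, verdict: dict, ttl: int = DEFAULT_TTL_SECONDS) -> None:
    """Store a guardrail verdict with a TTL."""
    _cache[key] = (time.time() + ttl, verdict)
```

Because the guardrail version is part of the key, tightening a threshold or swapping a classifier only requires bumping `GUARDRAIL_VERSION`; stale verdicts simply stop matching and age out via TTL.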
Tags:
LLM, Guardrails, Agentic, Agents, Caching, Token Optimization, Architecture, Observability, LangGraph

Eyal Wirsansky
Senior Data Scientist, Artificial Intelligence Mentor
Jacksonville, Florida, United States