When LLMs Go Rogue: Securing Prompts and Ensuring Persona Fidelity

Even the most carefully crafted system prompts can “go rogue,” reverting to generic assistant mode or leaking hidden instructions, undermining security, consistency, and user trust. Drawing on hard-earned lessons from building a goal-oriented AI group chat platform, this session delivers:

* Multiple prompt-leakage and prompt-reversion examples showcasing real-world LLM failures
* Live demos of evaluation workflows, detecting and analysing rogue or unexpected responses in real time
* Practical security patterns for prompt engineering to mitigate leakage and fallback risks
* Techniques for adding nondeterministic evaluation tests into your deployment pipeline

This no-fluff, demo-driven talk equips engineers and security practitioners with battle-tested patterns to keep LLM-powered applications on-brand and secure. You’ll leave with open-source repos, threat-model templates, and actionable takeaways to implement immediately.

Ben Dechrai

Disaster Postponement Officer

Kansas City, Missouri, United States

Actions

View Speaker Profile

Please note that Sessionize is not responsible for the accuracy or validity of the data provided by speakers. If you suspect this profile to be fake or spam, please let us know.

Session

When LLMs Go Rogue: Securing Prompts and Ensuring Persona Fidelity

Ben Dechrai

Links

Actions