Loop, Rinse, Repeat: The Self-Amplifying Agent Attack Prompt Hardening Won't Stop

One poisoned calendar invite. One agent that read it, believed it, and acted on it. And then acted on it again. And again. Five unauthorized wire transfers totaling $250,000 in under ten seconds, with nothing in the architecture to make it stop. This is not a theoretical scenario. It is a reproducible result from a controlled multi-agent simulation, and evidence of a threat category that has had no name until now: the Cascading Amplifier.

A human insider is naturally bounded by fatigue, hesitation, and the friction of organizational life. An autonomous agent's reasoning loop has none of that. Give it a consequential tool, an iterative loop, and no per-action authorization gate, and a single injected instruction just keeps executing. Loop, rinse, repeat, until the damage is done.

This talk presents findings from seven controlled simulation runs across two enterprise attack scenarios: a Confused Deputy financial fraud chain and an IP exfiltration attack targeting a patent-pending algorithm worth over $2.5M. The central finding challenges a widely held assumption. In the same test environment, system prompt hardening achieved 100% attack prevention against moderate-sophistication injections and exactly 0% against high-sophistication injections that used authority language and compliance urgency framing. Probabilistic defenses have a ceiling, and motivated adversaries will find it.

The only control that held across every configuration was a deterministic, infrastructure-level authorization gate called an Intent Capsule. It enforces permitted tool scope regardless of what the LLM decides. When the hardened prompt failed, the capsule blocked the attack before any data moved.
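The abstract does not say how the Intent Capsule is built, so the following is only a minimal sketch of the general pattern it describes: a deterministic check that sits between the agent loop and its tools and enforces a fixed tool scope, whatever the model requests. Every name here (`IntentCapsule`, `ToolGate`, `call_budget`, `ScopeViolation`, the tool names, the transfer amount) is hypothetical and chosen for illustration, not taken from the research.

```python
from dataclasses import dataclass


class ScopeViolation(Exception):
    """Raised when a requested tool call is outside the permitted scope."""


@dataclass(frozen=True)
class IntentCapsule:
    """Hypothetical capsule: fixed when the task starts, never edited by the model."""
    task: str
    call_budget: dict  # tool name -> maximum number of calls allowed for this task


class ToolGate:
    """Sits between the agent loop and the tools; plain code, no LLM involved."""

    def __init__(self, capsule, tools):
        self._capsule = capsule
        self._tools = tools
        self._used = {}

    def call(self, name, **kwargs):
        # Tools not listed in the capsule get a budget of zero calls.
        budget = self._capsule.call_budget.get(name, 0)
        used = self._used.get(name, 0)
        if used >= budget:
            raise ScopeViolation(
                f"{name!r} is not permitted for task {self._capsule.task!r}"
            )
        self._used[name] = used + 1
        return self._tools[name](**kwargs)


def run_demo():
    executed = []  # records any transfer that actually runs
    tools = {
        "read_calendar": lambda: "invite text containing an injected instruction",
        "wire_transfer": lambda amount: executed.append(amount),
    }
    capsule = IntentCapsule(task="summarize calendar", call_budget={"read_calendar": 1})
    gate = ToolGate(capsule, tools)

    gate.call("read_calendar")
    blocked = 0
    # Stand-in for a compromised reasoning loop that keeps requesting transfers.
    # The amount is illustrative and not taken from the talk.
    for _ in range(5):
        try:
            gate.call("wire_transfer", amount=1_000)
        except ScopeViolation:
            blocked += 1
    return executed, blocked
```

In this sketch, `run_demo()` leaves `executed` empty and counts five blocked calls: the repeated transfer requests never reach the tool, because the decision is made by the gate's lookup and not by anything the model can be talked into. The per-tool call budget is one possible way to also bound repetition of a tool that is legitimately in scope.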

Attendees will leave with a clear mental model for why prompt-layer defenses cannot provide enterprise security guarantees and a concrete architectural pattern they can actually deploy.

Original research. Live simulation demo. Strictly vendor-agnostic.

Chandan Vedavyas

IT Engineer, Carnegie Mellon University

San Francisco, California, United States
