Ignore Previous Instructions: Offensive Intelligence for the AI Era

We keep trying to secure AI the same way we secure software, and that assumption is already costing us.

Modern AI systems don’t behave like traditional applications. They write code, call tools, chain decisions, and act on the world autonomously. Yet most organizations are still focused on controlling what the model says. That is not where the bodies are buried.

David Campbell spent three years red teaming AI systems across enterprise and government environments, and the same pattern kept appearing. The interesting failures were not in the model. They emerged in everything built around it.

These systems expose a new attack surface: agents with delegated authority, tool sprawl, workflows that drift from read to action, memory and context that outlast a session, and “allowed behavior” that becomes the attack path.

This talk walks through real adversarial scenarios: a benign prompt becoming an action chain, a low-privilege interaction escalating through tool access, and an agent routing around intended controls without a traditional vulnerability.

Campbell frames the problem through behavior, identity, and control. The industry is obsessed with behavior. Attackers are not. They target authority and the absence of meaningful control around it.

The result is a class of systems that do not need to be broken to be exploited. They only need to be used as designed.

“Ignore Previous Instructions” was three years ago. This is a talk about how AI systems actually get hacked.

David Campbell

Head of AI Security at Scale AI

Boston, Massachusetts, United States
