Written for: CISO, Head of Security, Board director

AI incident response and resilience

AI incident response needs its own playbook: extend your IR process, build a kill-switch decision tree, and design resilience for non-deterministic AI.

By Giovanni Salvador · 12 June 2026 · 6 min read

Your incident-response process was built for deterministic systems. AI breaks several of its assumptions at once, and the gap will not close itself.

When I walk through an AI incident with a security team for the first time, the moment that lands hardest is this: the same prompt may not reproduce the behaviour that caused the problem. The system is non-deterministic. The “attacker input” may have been a benign-looking document retrieved three steps earlier. And if the system was an agent, the incident was not data exfiltration but action: the system did something in the world, with customer or market effect, on its own authority. Your existing runbook was not written for any of that.

This does not mean rebuilding your IR capability from scratch. It means extending what you have with a small, precise set of AI-specific additions. Here is how I think about that extension, and what your resilience programme needs to do once you have made it.

The stake

The argument for treating AI incidents as a distinct class is not that they are more dangerous than conventional ones. It is that they are different in ways that make the standard response steps fail.

A conventional forensic investigation replays the attack from an authoritative, deterministic record. For AI you have to design that record in advance, because the system will not hand it to you after the fact. And the blast radius of an AI incident is different: one session, one connector, one tenant, or the whole service, depending on where the failure sits. Getting containment scope wrong in either direction is itself a problem.

The four-category AI incident taxonomy

Triage starts with naming the incident, because the response differs sharply by type. I use four categories.

Injection-driven action. A prompt injection, direct or indirect, causes the system to take an action it should not, or to emit data it should not. The defining question is what the system did as a result, not merely what was injected.

Data leak via copilot. A copilot or assistant surfaces regulated data to a principal not entitled to it. The leak may be a single response or a systematic over-retrieval affecting many sessions.

Rogue agent action. An agent takes unintended, repeated, or runaway actions: duplicate transactions, a loop that drains a budget, a chain of tool calls that escalates beyond its intended scope.

Poisoned model or connector. A model, retrieval corpus, or Model Context Protocol connector has been tampered with upstream, so the system behaves adversarially by construction rather than because of a single injected prompt. This is a supply-chain incident surfacing at runtime, and it typically affects every session until the poisoned component is identified and replaced.

Classify the incident into one of these before you reach for containment, because each points at a different blast radius and a different set of evidence to preserve.
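To make the classification step concrete, the four categories can be encoded so that triage tooling and runbooks share one vocabulary. This is a minimal sketch, not a prescribed schema: the names `IncidentCategory`, `DEFINING_QUESTION`, and `triage_question` are hypothetical, and the question strings are my paraphrase of the descriptions above.

```python
from enum import Enum


class IncidentCategory(Enum):
    """The four AI incident categories used for triage."""
    INJECTION_DRIVEN_ACTION = "injection-driven action"
    DATA_LEAK_VIA_COPILOT = "data leak via copilot"
    ROGUE_AGENT_ACTION = "rogue agent action"
    POISONED_MODEL_OR_CONNECTOR = "poisoned model or connector"


# First question the responder asks for each category.
# Wording is illustrative and paraphrases the taxonomy text.
DEFINING_QUESTION = {
    IncidentCategory.INJECTION_DRIVEN_ACTION:
        "What did the system do as a result of the injection?",
    IncidentCategory.DATA_LEAK_VIA_COPILOT:
        "Is this a single response or systematic over-retrieval across sessions?",
    IncidentCategory.ROGUE_AGENT_ACTION:
        "Which actions were unintended, repeated, or runaway?",
    IncidentCategory.POISONED_MODEL_OR_CONNECTOR:
        "Which model, corpus, or connector was tampered with upstream?",
}


def triage_question(category: IncidentCategory) -> str:
    """Return the first question to answer before choosing containment."""
    return DEFINING_QUESTION[category]
```

Forcing every incident ticket to carry exactly one of these values is a design choice: it makes "unclassified" visible as a gap rather than a default.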

Containment and forensics for non-deterministic systems

The forensically useful artefact is the whole exchange, not a bare prompt-and-response pair. I design for this before an incident, not during it:

System prompt and version. What instruction set was live at the time?
User turn and full context. Every span of retrieved or tool-sourced content concatenated into the context window, with provenance.
Model identifier and parameters. Exact model version and configuration, not just the model family.
Tool and connector calls. Every call, with its arguments and results, in sequence.
Final output. What the system emitted.

Replaying the captured prompt may not reproduce the behaviour, and failing to reproduce it does not mean it did not happen. The authoritative record is the captured trace, which is why logging discipline has to be in place before the incident.
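The five artefacts listed above can be captured as one record per exchange. The following is a sketch under stated assumptions: the class and field names (`InteractionTrace`, `ContextSpan`, `ToolCall`, `missing_fields`) are hypothetical, the provenance format is left open, and a real deployment would write this to tamper-evident storage rather than return a string.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class ContextSpan:
    content: str      # retrieved or tool-sourced text placed in the context window
    provenance: str   # where it came from; the identifier format is an assumption


@dataclass(frozen=True)
class ToolCall:
    name: str
    arguments: dict[str, Any]
    result: str


@dataclass(frozen=True)
class InteractionTrace:
    system_prompt: str
    system_prompt_version: str
    user_turn: str
    context_spans: tuple[ContextSpan, ...]   # in the order they were concatenated
    model_id: str                            # exact version, not just the family
    model_params: dict[str, Any]
    tool_calls: tuple[ToolCall, ...]         # in call sequence
    final_output: str

    def to_json(self) -> str:
        """Serialise the whole exchange as one JSON document."""
        return json.dumps(asdict(self), sort_keys=True)


# Fields that must never be empty. Context spans and tool calls are excluded
# because an exchange can legitimately have none.
REQUIRED = ("system_prompt", "system_prompt_version", "user_turn",
            "model_id", "model_params", "final_output")


def missing_fields(trace: InteractionTrace) -> list[str]:
    """Return the required fields that are empty in this trace."""
    return [name for name in REQUIRED if not getattr(trace, name)]
```

Running `missing_fields` against a sample of production traces is a quick way to check whether the logging discipline described above is actually in place.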

Scope containment to the blast radius the incident actually confirms. A single-session leak does not justify stopping the whole service: over-broad containment creates a second incident, an availability one, on top of the first.

The kill-switch decision tree

When containment is needed, the responder needs a pre-agreed decision tree rather than a judgement call made under pressure. I sequence it this way:

Q1: Is this a poisoned or compromised model or connector? If yes, ask whether the poisoned component is a single connector rather than a shared model or core runtime. A poisoned connector is contained at connector scope. A poisoned shared model or core runtime justifies a whole-service safe-stop.

Q2: Is regulated data actively leaking, or is an agent actively taking harmful action? If yes, ask whether the harm is bounded to one session or principal. If bounded: circuit-break or isolate the session. If not bounded: safe-stop the service or invoke the kill switch, scoped to the confirmed blast radius, no wider.

If neither: degrade gracefully. Tighten guardrails, lower autonomy, restrict connectors, raise human-in-the-loop requirements. Monitor and investigate.

In all cases: preserve the trace and model state. Then assess your reporting obligations.

Three design rules govern the tree. First, the decision criteria are agreed and rehearsed before an incident. Second, invoking containment and preserving evidence are not in tension if you designed for both: the trace is captured continuously, so a safe-stop freezes a record that already exists. Third, even the Q1 branch, where you stop on suspicion of a poisoned component, obeys the blast-radius-scoping rule: connector scope for a single connector, whole-service only for a shared model or core runtime.
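The tree is small enough to write down as a function, which also makes it easy to rehearse in a tabletop. This is a sketch of the sequence described above, not a production control: `IncidentFacts`, `Action`, and `decide` are hypothetical names, and the boolean inputs stand in for judgements a responder still has to make.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ISOLATE_CONNECTOR = "contain at connector scope"
    SAFE_STOP_SERVICE = "whole-service safe-stop"
    ISOLATE_SESSION = "circuit-break or isolate the session"
    SCOPED_KILL_SWITCH = "safe-stop or kill switch, scoped to confirmed blast radius"
    DEGRADE_GRACEFULLY = "degrade gracefully, monitor and investigate"


@dataclass(frozen=True)
class IncidentFacts:
    poisoned_component: bool                   # Q1
    poisoned_is_single_connector: bool         # Q1 follow-up
    active_leak_or_harmful_action: bool        # Q2
    bounded_to_one_session_or_principal: bool  # Q2 follow-up


# Steps that apply on every branch, in order.
ALWAYS = ("preserve the trace and model state", "assess reporting obligations")


def decide(facts: IncidentFacts) -> tuple[Action, tuple[str, ...]]:
    """Walk Q1, then Q2, then fall through to graceful degradation."""
    if facts.poisoned_component:
        if facts.poisoned_is_single_connector:
            return Action.ISOLATE_CONNECTOR, ALWAYS
        return Action.SAFE_STOP_SERVICE, ALWAYS
    if facts.active_leak_or_harmful_action:
        if facts.bounded_to_one_session_or_principal:
            return Action.ISOLATE_SESSION, ALWAYS
        return Action.SCOPED_KILL_SWITCH, ALWAYS
    return Action.DEGRADE_GRACEFULLY, ALWAYS
```

For example, `decide(IncidentFacts(False, False, True, True))` returns the session-isolation action together with the two always-on steps. Keeping `ALWAYS` in the return value means no branch can be taken without the evidence and reporting steps attached.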

Designing resilience for AI services

Incident response is what you do when a control fails. Operational resilience is the broader discipline of keeping the services your customers depend on running through disruption you cannot prevent. AI changes the shape of that disruption.

An AI-supported service can degrade in ways a conventional service does not. A provider rate-limits or deprecates a model. A vendor silently updates a model and behaviour regresses. A connector goes down. Or the single gateway becomes the chokepoint through which everything fails at once.

Four controls build resilience into AI services:

Map your AI deployments onto your important business services. Start from the services your resilience programme already tracks and ask which now depend on AI. AI dependence must be visible at the level resilience is managed.

Build the dependency map over models, connectors, and providers. Surface concentration risk: several services depending on the same foundation model, the same provider region, or the same gateway. You cannot set a tolerance for a dependency you have not mapped.

Design impact tolerances in behavioural terms, not just uptime. A failover to a different model can restore availability while silently changing behaviour. Define what “within tolerance” means for the AI service’s outputs, not just its response time.

Plan exit and substitutability. For each AI dependency on an important business service, have an exit plan: can you move to another model or provider, in what time, with what re-testing, and at what loss of capability?
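The dependency-mapping control lends itself to a simple check for concentration risk. The sketch below assumes you can export, per business service, the set of models, connectors, and gateways it depends on; the service and dependency names are invented for illustration, and `concentration` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical service-to-dependency map; all names are illustrative only.
DEPENDENCIES = {
    "payments-support-copilot": {"model:foundation-a", "gateway:main", "connector:crm"},
    "fraud-triage-agent": {"model:foundation-a", "gateway:main"},
    "kyc-document-review": {"model:foundation-b", "gateway:main"},
}


def concentration(dependencies: dict[str, set[str]],
                  threshold: int = 2) -> dict[str, list[str]]:
    """Return each dependency shared by at least `threshold` services."""
    users: dict[str, set[str]] = defaultdict(set)
    for service, deps in dependencies.items():
        for dep in deps:
            users[dep].add(service)
    return {dep: sorted(services)
            for dep, services in users.items()
            if len(services) >= threshold}
```

With the sample map, `concentration(DEPENDENCIES)` flags `model:foundation-a` (two services) and `gateway:main` (all three), which are the entries that need an explicit tolerance and exit plan first.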

Testing resilience operationally

A design you have not tested is a hypothesis. Operational resilience testing means exercising the live, deployed service against the disruption scenarios you designed for.

Severance tests disable a model, provider, connector, or gateway and verify the service degrades safely and recovers. The assertion must cover behaviour, not just uptime: DR for an AI service is not done when the service answers; it is done when the answers are back within tolerance.
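A severance test with a behavioural assertion might look like the following. This is a toy sketch: `Service`, `sever_primary`, and `within_tolerance` are hypothetical, the models are stand-in functions, and exact-match against a golden set is a deliberate simplification of whatever behavioural tolerance you actually define.

```python
from typing import Callable, Optional

Model = Callable[[str], str]


class Service:
    """Toy AI-backed service with a primary model and a fallback."""

    def __init__(self, primary: Model, fallback: Model) -> None:
        self.primary: Optional[Model] = primary
        self.fallback = fallback

    def sever_primary(self) -> None:
        """Simulate losing the primary model or its provider."""
        self.primary = None

    def answer(self, prompt: str) -> str:
        model = self.primary if self.primary is not None else self.fallback
        return model(prompt)


def within_tolerance(service: Service, golden: dict[str, str],
                     min_match_rate: float) -> bool:
    """True if enough answers match the golden set (exact match, for simplicity)."""
    matches = sum(1 for prompt, expected in golden.items()
                  if service.answer(prompt) == expected)
    return matches / len(golden) >= min_match_rate


# Illustrative golden set and stand-in models.
GOLDEN = {"refund over limit?": "escalate", "reset password?": "self-serve"}


def primary_model(prompt: str) -> str:
    return GOLDEN[prompt]


def fallback_model(prompt: str) -> str:
    return "refer to human"
```

After `sever_primary()`, the service still answers every prompt, so an uptime check passes, yet `within_tolerance(service, GOLDEN, 0.9)` is false. That gap between "answering" and "answering within tolerance" is what the severance test exists to expose.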

This is distinct from build-time red-teaming, which is adversarial pre-deployment testing within the SDLC. A build-time red-team will not tell you whether the service survives losing its model provider. A severance test will not tell you whether the latest prompt change reopened an injection path. They are complementary, and neither discharges the other.

What to do this week

Take your four most AI-dependent business services and ask: which of the four incident categories has no containment procedure in your current runbook? Write the gap list.
Check whether your incident logging captures the full interaction trace: system prompt version, retrieved content provenance, tool calls, model identifier. If not, that is the first control to add.
Schedule one tabletop exercise. Get the SOC, model owners, legal, and communications in the same room. Run a rogue-agent scenario and a poisoned-connector scenario. Surface the unresolved decisions: who is authorised to invoke the kill switch, and what counts as “actively leaking.”
For each AI dependency on an important business service, write one sentence on your exit plan. If you cannot, you do not have one.

The reporting obligations that attach to a major AI incident, and what DORA requires your playbook to produce, are covered in our DORA readiness for fintech pillar.
