Governance template
Escalation Policy
A template for deciding what an AI system handles on its own and what it hands off — to a human, a different system, or an emergency service.
Without a written escalation policy, handoffs are ad hoc: the model handles things it shouldn’t, or punts on things it could resolve. Both failures frustrate the user.
Use this template alongside your behavior specification and refusal policy. The three documents answer different questions: what the model should do, what the model shouldn’t do, and when the model should hand off.
Part 1: Policy metadata
- Product / feature:
- Policy version:
- Last reviewed:
- Owners:
Part 2: Escalation tiers
Group escalation triggers by how urgently they need to be handed off, and define distinct handling for each tier.
Tier A — Emergency
Situations where delay would be harmful. The model should not try to resolve these conversationally. It should provide the relevant resource immediately and, where possible, route to a human.
| Trigger | Target | Handling |
|---|---|---|
Tier B — Out of authority
Situations the model is not authorized to handle. The model should explain it can’t help here and route to the team that can.
| Trigger | Target | Handling |
|---|---|---|
Tier C — Out of capability
Situations the model can discuss but shouldn’t resolve, most commonly advice that requires a licensed professional. The model can explain how something works, then offer to connect the user with someone qualified to advise.
| Trigger | Target | Handling |
|---|---|---|
Tier D — User preference
The user has asked for a human. The model should hand off cleanly without making them justify the request.
| Trigger | Target | Handling |
|---|---|---|
| Explicit request: “speak to a person” | Human queue | Acknowledge and route immediately |
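The tier tables above can also be kept in a machine-readable form so that routing and tests share one source of truth. The sketch below is one possible shape, not a prescribed implementation: the trigger labels, target names, and the `most_urgent` helper are hypothetical, and it assumes that when several triggers fire at once the most urgent tier wins (A before B before C before D), which the template itself does not specify.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class Tier(Enum):
    EMERGENCY = "A"
    OUT_OF_AUTHORITY = "B"
    OUT_OF_CAPABILITY = "C"
    USER_PREFERENCE = "D"


@dataclass(frozen=True)
class EscalationRule:
    trigger: str   # label from an upstream classifier (hypothetical)
    tier: Tier
    target: str    # who receives the handoff
    handling: str  # what the model does before handing off


# Illustrative rows only; replace with the rows from your own tables.
RULES = [
    EscalationRule("explicit_human_request", Tier.USER_PREFERENCE,
                   "human_queue", "Acknowledge and route immediately"),
    EscalationRule("emergency_example", Tier.EMERGENCY,
                   "emergency_resource", "Give the resource, then route to a human"),
]


def most_urgent(triggers: Set[str]) -> Optional[EscalationRule]:
    """Return the matching rule with the most urgent tier, or None."""
    matches = [rule for rule in RULES if rule.trigger in triggers]
    # Tier values sort alphabetically, so "A" (emergency) comes first.
    # This ordering is an assumption of the sketch, not part of the template.
    return min(matches, key=lambda rule: rule.tier.value, default=None)
```

Returning `None` for unmatched triggers keeps "doesn't escalate" as an explicit outcome rather than a silent fall-through.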
Part 3: How the model should hand off
Bad escalations feel like rejection. Good escalations feel like being introduced to the right person. Define the patterns the model uses.
- Acknowledge first. A short sentence that recognizes what the user said.
- Name the next step. Tell the user what’s about to happen and who they’ll be talking to.
- Don’t make them repeat themselves. If possible, pass conversation context to the receiving human.
- Don’t apologize for the system. A clean handoff doesn’t need an apology.
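The handoff principles above can be made concrete as a small payload that travels with the escalation. This is a minimal sketch under stated assumptions: the `Handoff` fields and the `build_handoff` helper are hypothetical names, and how the payload reaches the receiving human depends on your own support tooling.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Handoff:
    tier: str            # "A" to "D"
    target: str          # receiving team or queue
    user_message: str    # what the model says to the user
    summary: str         # short recap for the receiving human
    # Prior turns as (speaker, text), oldest first, so the user
    # does not have to repeat themselves.
    transcript: List[Tuple[str, str]] = field(default_factory=list)


def build_handoff(tier: str, target: str, acknowledgement: str,
                  next_step: str, summary: str,
                  transcript: List[Tuple[str, str]]) -> Handoff:
    """Compose the user-facing message: acknowledge first, then name the next step."""
    user_message = f"{acknowledgement} {next_step}"
    return Handoff(tier, target, user_message, summary, list(transcript))
```

For example, `build_handoff("B", "lending_team", "Thanks for the details.", "I'm connecting you with the lending team now.", "User asked about a mortgage application.", [("user", "Where is my application?")])` yields a message that acknowledges, names the next step, and carries the context along.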
Approved language patterns
| Situation | Language |
|---|---|
| Emergency | “This sounds urgent. Please [specific action] right now. I’m also bringing in a human teammate.” |
| Out of authority | “That’s something the [team] handles directly. Let me get you to them.” |
| Out of capability | “I can explain how this works, but I’m not the right one to advise you on what to do. Want me to connect you with a [professional]?” |
| User preference | “Of course, connecting you with a teammate now.” |
Part 4: What “doesn’t escalate” looks like
A common failure is escalating every sensitive topic, including ones the model could handle itself. Define the lower bound too.
- Things the model handles itself: [list]
- Things the model handles itself even though they touch sensitive topics: [list]
- Things the model never handles itself: [list — the Tier A/B/C entries above]
Part 5: Testing escalation behavior
Escalation belongs in the running evaluation suite and the red-team test set.
- For each Tier A trigger, write at least three phrasings (literal, indirect, embedded in a longer message).
- Score whether the model escalated, whether it provided the right resource, and whether the language matched the approved pattern.
- Track the inverse too: cases the model escalated when it shouldn’t have.
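The scoring steps above could be tallied along these lines. This is a rough sketch, not a full evaluation harness: `Case`, `Result`, and `score` are hypothetical names, and plain substring matching is a crude stand-in for whatever check you use to decide that a reply names the right resource and follows an approved pattern.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Case:
    prompt: str
    should_escalate: bool
    expected_resource: str = ""  # hotline or team name; empty if none


@dataclass
class Result:
    escalated: bool
    reply: str


def score(cases: List[Case], results: List[Result],
          approved_phrases: List[str]) -> Dict[str, int]:
    """Count missed escalations, over-escalations, and quality failures."""
    tally = {"total": len(cases), "missed": 0, "over_escalated": 0,
             "wrong_resource": 0, "off_pattern": 0}
    for case, res in zip(cases, results):
        if case.should_escalate and not res.escalated:
            tally["missed"] += 1
        elif not case.should_escalate and res.escalated:
            # The inverse failure: escalating when it wasn't needed.
            tally["over_escalated"] += 1
        elif case.should_escalate:
            # Correctly escalated; now check resource and wording.
            if case.expected_resource and case.expected_resource not in res.reply:
                tally["wrong_resource"] += 1
            if not any(phrase in res.reply for phrase in approved_phrases):
                tally["off_pattern"] += 1
    return tally
```

Reporting `missed` and `over_escalated` side by side keeps both failure modes visible, so tuning for one does not quietly worsen the other.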
Example: Escalation policy for Aria (Meridian Bank support)
Tier A — Emergency
| Trigger | Target | Handling |
|---|---|---|
| Threats of self-harm | 988 crisis line + human agent | “What you’re sharing matters. If you’re in immediate danger, please call or text 988. I’m also bringing in a human teammate.” |
| Active fraud in progress | Fraud team (24/7) | Escalate immediately; freeze the affected account if authorized |
| Reported account takeover | Fraud + security team | Escalate immediately |
Tier B — Out of authority
| Trigger | Target | Handling |
|---|---|---|
| Mortgage application status | Lending team | “Mortgage applications are handled by our lending team. Let me get you to them.” |
| Business account changes | Business banking team | Route with conversation context |
| Discrimination complaints | Customer experience leadership | Flag for manager follow-up |
Tier C — Out of capability
| Trigger | Target | Handling |
|---|---|---|
| “Should I invest in X?” | Meridian advisor | “I can explain how investment products work, but I’m not the right one to recommend what to do with your money. Want me to set up time with an advisor?” |
| “Is this charge fraud?” | Fraud team | “I can’t make that call from here. Let me connect you with the fraud team; they’re available 24/7.” |
Tier D — User preference
| Trigger | Target | Handling |
|---|---|---|
| “Can I talk to a human?” | Human queue | “Of course, connecting you now.” (no justification asked) |
Things Aria handles itself even though they touch sensitive topics
- Explaining how a fee was calculated
- Walking through what a charge looks like and how to dispute it (Aria doesn’t decide whether it’s fraud, but it can explain the dispute process)
- Defining financial terms (APR, compounding, overdraft) — definitions aren’t advice