Governance template
Refusal Policy
A template for documenting what an AI system refuses, why it refuses, and how it communicates refusals — making the hardest behavioral decisions explicit and consistent.
A refusal policy defines when and how an AI system declines requests. Without a documented policy, refusal behavior is inconsistent — teams argue case by case, the model behaves differently across similar prompts, and users receive incoherent experiences. A policy doesn’t answer every question, but it makes the framework for answering questions explicit.
Use this template to document refusal decisions at the product level. It should be informed by your behavior specification and reviewed whenever your content policy or product scope changes.
Part 1: Policy Metadata
Product / feature:
Policy version:
Last reviewed:
Owners:
Part 2: Refusal Taxonomy
Categorize the types of requests this system may refuse and assign a default posture to each.
Tier 1: Absolute refusals
These requests are refused regardless of user identity, framing, operator authorization, or context. They represent absolute limits.
| Category | Description | Handling |
|---|---|---|
| | | Hard refuse, no explanation of how to get it elsewhere |
| | | Hard refuse |
Note: Tier 1 refusals should be few and clearly justified. A long Tier 1 list often indicates over-specification or policy that belongs in Tier 2.
Tier 2: Conditional refusals
These requests are refused or handled with caution in most contexts but may be appropriate in specific product or user contexts. The condition determines the behavior.
| Category | Default posture | Permitted context | Handling |
|---|---|---|---|
| | Refuse | [Specific operator context] | Decline with explanation |
| | Hedge | [Professional user context] | Answer with caveats |
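
The Tier 2 rule that "the condition determines the behavior" can be sketched in code. The following is a minimal illustration, not a prescribed implementation: the names `Posture`, `ConditionalRule`, and `resolve_posture`, and the context label `professional_user`, are hypothetical and stand in for whatever your behavior specification defines.

```python
from dataclasses import dataclass
from enum import Enum


class Posture(Enum):
    REFUSE = "refuse"
    HEDGE = "hedge"
    ALLOW = "allow"


@dataclass(frozen=True)
class ConditionalRule:
    """One row of a Tier 2 table (hypothetical shape)."""
    category: str
    default: Posture                    # posture when no permitted context applies
    permitted_contexts: frozenset[str]  # contexts that unlock the alternative posture
    permitted_posture: Posture          # posture used inside a permitted context


def resolve_posture(rule: ConditionalRule, active_contexts: set[str]) -> Posture:
    """Return the permitted posture if any permitted context is active, else the default."""
    if rule.permitted_contexts & active_contexts:
        return rule.permitted_posture
    return rule.default


# Illustrative rule: refuse by default, answer with caveats for professional users.
example_rule = ConditionalRule(
    category="example_category",
    default=Posture.REFUSE,
    permitted_contexts=frozenset({"professional_user"}),
    permitted_posture=Posture.HEDGE,
)
```

Keeping the default and the permitted posture in the same record makes it harder for a permitted context to be added without someone also deciding what the model should do inside it.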
Tier 3: Redirects
These requests are out of scope for this product but are legitimate. The model should decline to handle them in this context and point the user toward a more appropriate resource.
| Category | Redirect target | Sample language |
|---|---|---|
| | | “That’s outside what I can help with here. For [X], you might try [Y].” |
Tier 4: Escalations
These requests should be handled by routing the user to a human agent, emergency service, or higher-authority system rather than by the model responding directly.
| Trigger | Escalation target | Sample language |
|---|---|---|
| Safety crisis | Emergency services | “It sounds like this is urgent. Please call [emergency number].” |
| Complex account issue | Human agent | “Let me connect you with a team member who can help with this.” |
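
If the taxonomy is also kept in machine-readable form, the four tiers can be represented as plain data. This is one possible sketch under stated assumptions: `Tier`, `PolicyEntry`, and `RefusalPolicy` are hypothetical names, and the category keys are illustrative slugs for the two sample escalation rows above. A lookup that returns `None` signals a case the policy does not cover, which is handled by the edge-case process rather than by a guessed default.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, Iterable, List, Optional


class Tier(IntEnum):
    ABSOLUTE = 1     # Tier 1: absolute refusals
    CONDITIONAL = 2  # Tier 2: conditional refusals
    REDIRECT = 3     # Tier 3: redirects
    ESCALATION = 4   # Tier 4: escalations


@dataclass(frozen=True)
class PolicyEntry:
    category: str
    tier: Tier
    handling: str
    target: Optional[str] = None  # redirect or escalation target, if any


class RefusalPolicy:
    def __init__(self, entries: Iterable[PolicyEntry]) -> None:
        self._entries: Dict[str, PolicyEntry] = {}
        for entry in entries:
            # A category listed twice is ambiguous, so reject it up front.
            if entry.category in self._entries:
                raise ValueError(f"duplicate category: {entry.category}")
            self._entries[entry.category] = entry

    def lookup(self, category: str) -> Optional[PolicyEntry]:
        """Return the entry for a category, or None if the policy does not cover it."""
        return self._entries.get(category)

    def categories_in(self, tier: Tier) -> List[str]:
        """List categories in a tier, e.g. to check that Tier 1 stays short."""
        return sorted(c for c, e in self._entries.items() if e.tier == tier)


policy = RefusalPolicy([
    PolicyEntry("safety_crisis", Tier.ESCALATION,
                "Route to emergency services", "Emergency services"),
    PolicyEntry("complex_account_issue", Tier.ESCALATION,
                "Hand off to a human agent", "Human agent"),
])
```

How a request gets mapped to a category (classifier, rules, or human triage) is out of scope for this sketch.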
Part 3: Refusal Language Guidelines
Consistent refusal language is part of product quality. Use this section to define how refusals should be communicated.
Principles
- Be clear, not preachy: tell the user what you can’t help with, not why they were wrong to ask.
- Offer alternatives where possible: redirect rather than just decline.
- Don’t repeat or paraphrase the refused request back to the user.
- Don’t moralize: one brief statement of limits is enough.
- Never claim incapability when the real reason is a policy choice: say “I won’t do that” rather than “I can’t do that.”
Approved language patterns
| Situation | Approved language |
|---|---|
| Out of scope | “That’s outside what I’m set up to help with here. [Redirect if available].” |
| Policy boundary | “That’s not something I’ll help with.” |
| Escalation trigger | “For something like this, it’s best to [specific escalation path].” |
Prohibited language patterns
| Pattern | Reason |
|---|---|
| “I’m just an AI and…” | Deflects rather than explains; irrelevant |
| “That’s a dangerous/harmful/bad request…” | Moralizes; presumes bad intent |
| “I can’t do that” (when the truth is “I won’t”) | Misleads about the nature of the limit |
| Long apology before declining | Adds friction without value |
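
Prohibited patterns like these can be checked automatically against draft or logged refusal text. The sketch below is one rough way to do that, assuming simple regular expressions are good enough for a first pass. The regexes, the `MAX_APOLOGIES` threshold, and the function name `lint_refusal` are all assumptions introduced for illustration. Because “I can’t” is only a problem when the limit is really a policy choice, the caller has to supply that fact.

```python
import re

# Always-prohibited phrasings, adapted from the table above (regexes are assumptions).
ALWAYS_PROHIBITED = {
    "just_an_ai": re.compile(r"\bI['’]m just an AI\b", re.IGNORECASE),
    "moralizing": re.compile(
        r"\bthat['’]s an? (dangerous|harmful|bad) request\b", re.IGNORECASE
    ),
}

# Only a finding when the real reason for the limit is policy, not capability.
FALSE_INCAPABILITY = re.compile(r"\bI (can['’]t|cannot)\b", re.IGNORECASE)

# Crude proxy for "long apology": more than one apology word in the message.
APOLOGY = re.compile(r"\b(sorry|apologi[sz]e)\b", re.IGNORECASE)
MAX_APOLOGIES = 1  # assumed threshold


def lint_refusal(text: str, limit_is_policy_choice: bool) -> list[str]:
    """Return the names of prohibited patterns found in a refusal message."""
    findings = [name for name, pat in ALWAYS_PROHIBITED.items() if pat.search(text)]
    if limit_is_policy_choice and FALSE_INCAPABILITY.search(text):
        findings.append("false_incapability")
    if len(APOLOGY.findall(text)) > MAX_APOLOGIES:
        findings.append("long_apology")
    return findings


print(lint_refusal("I'm just an AI and I can't do that.", limit_is_policy_choice=True))
# prints ['just_an_ai', 'false_incapability']
```

A check like this catches only literal phrasings. It is a complement to human review of refusal language, not a replacement for it.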
Part 4: Edge Cases and Escalation Process
For cases not covered by this policy:
- Document the case with the prompt and context.
- Escalate to [owner] within [timeframe].
- Log the decision and update the policy if appropriate.
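
The three steps above can be captured in a small record so that uncovered cases are tracked rather than resolved informally. This is only a sketch: `EdgeCase` and its field names are hypothetical, and the owner and escalation window stand in for the `[owner]` and `[timeframe]` placeholders, which your team still has to fill in.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class EdgeCase:
    prompt: str                   # the request that was not covered by the policy
    context: str                  # product surface, user state, and so on
    owner: str                    # who the case is escalated to
    opened_at: datetime
    escalation_window: timedelta  # the agreed timeframe for a decision
    decision: Optional[str] = None
    policy_updated: bool = False

    @property
    def due_at(self) -> datetime:
        return self.opened_at + self.escalation_window

    def is_overdue(self, now: datetime) -> bool:
        """True if no decision has been logged and the window has passed."""
        return self.decision is None and now > self.due_at

    def record_decision(self, decision: str, policy_updated: bool) -> None:
        """Log the outcome and whether the policy changed as a result."""
        self.decision = decision
        self.policy_updated = policy_updated


case = EdgeCase(
    prompt="(verbatim prompt)",
    context="(conversation and product context)",
    owner="policy-owner",  # placeholder value
    opened_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    escalation_window=timedelta(days=2),  # placeholder value
)
```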
Policy update process:
| Trigger | Review required | Approvers |
|---|---|---|
| New product feature | Yes | [list] |
| Incident involving refusal | Yes | [list] |
| Quarterly review | Yes | [list] |
Example: Refusal policy for Aria (Meridian Bank support)
A condensed, filled-in version of this template, as it might look for the example assistant.
Tier 1 — Absolute refusals
| Category | Description | Handling |
|---|---|---|
| Account access for someone other than the authenticated user | Any attempt to retrieve, modify, or act on an account that isn’t the current authenticated user’s | Hard refuse. No workaround offered. |
| Specific investment, tax, or lending advice | “Should I buy / sell / move money to X?” | Hard refuse. Offer connection to a Meridian advisor. |
Tier 2 — Conditional refusals
| Category | Default | Permitted context | Handling |
|---|---|---|---|
| Detailed transaction history | Refuse | Authenticated session, after identity confirmation | Provide; otherwise explain that authentication is required |
| Discussion of fees | Hedge | All contexts | Explain how fees work in plain language; don’t quote specific dollar amounts unless retrieved via tool |
| Definitions of financial terms | Allow | All contexts | Define plainly; do not extend into advice |
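
The fee rule above (no specific dollar amounts unless retrieved via tool) is concrete enough to check mechanically. As a rough sketch, a guard could extract dollar amounts from a draft reply and flag any that did not come from a tool result. The function names, the regex, and the idea of passing tool results as a set of `Decimal` values are assumptions for illustration, not a description of how Aria is built.

```python
import re
from decimal import Decimal

# Matches amounts such as $35, $35.00, or $1,250.50 (US-style formatting assumed).
DOLLAR_AMOUNT = re.compile(r"\$\s?(\d{1,3}(?:,\d{3})+|\d+)(\.\d{1,2})?")


def quoted_amounts(text: str) -> set[Decimal]:
    """Extract every dollar amount mentioned in the text."""
    amounts = set()
    for match in DOLLAR_AMOUNT.finditer(text):
        whole = match.group(1).replace(",", "")
        amounts.add(Decimal(whole + (match.group(2) or "")))
    return amounts


def unsupported_amounts(draft: str, tool_amounts: set[Decimal]) -> set[Decimal]:
    """Return amounts in the draft that were not retrieved via a tool call."""
    return quoted_amounts(draft) - tool_amounts


# Amount came from a tool result, so nothing is flagged.
print(unsupported_amounts("Your overdraft fee is $35.00.", {Decimal("35")}))
# prints set()

# No tool result backs this figure, so it is flagged.
print(unsupported_amounts("Fees are usually around $25.", set()))
# prints {Decimal('25')}
```

A flagged draft would then fall back to the plain-language explanation without figures.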
Tier 3 — Redirects
| Category | Redirect to | Sample language |
|---|---|---|
| Mortgage application status | Lending team | “Mortgage applications are handled by our lending team. I can connect you. Would you like that?” |
| Business account questions | Business banking team | “That’s something the business banking team handles. Want me to get you over to them?” |
| Competitor rate comparisons | Out of scope | “I can only speak to Meridian’s products. Want me to walk you through ours?” |
Tier 4 — Escalations
| Trigger | Target | Sample language |
|---|---|---|
| Suspected fraud | Fraud team (24/7 line) | “That sounds like it could be unauthorized activity. I’m going to connect you with our fraud team right now — they’re 24/7.” |
| Threats of self-harm | Crisis line + human agent | “What you’re sharing matters, and I want to make sure you talk to someone who can really help. If you’re in immediate danger, please call or text 988 (the 988 Suicide & Crisis Lifeline). I’m also bringing in a human teammate.” |
| Complaint about discrimination | Customer experience leadership | “I want to make sure this gets the attention it deserves. I’m flagging this to a manager who’ll follow up directly.” |
Refusal language patterns Aria uses
- “That’s outside what I can help with here — but I can get you to someone who can. Want me to do that?”
- “I’m not able to help with that one. Here’s where you can get help directly: [link].”
- “I can explain how that works, but I’m not the right one to recommend what you should do.”
Refusal language patterns Aria avoids
- “I’m just an AI…” (irrelevant)
- “I can’t…” when the truth is “I won’t” (misleading)
- Long apologies before a refusal
- Any moralizing about the request