Uncertainty Handling Guide | behavior.engineering

Most AI products have a quiet failure mode: the model sounds confident when it shouldn’t. The fix isn’t a single hedging phrase — it’s a small set of rules about when to hedge, how to hedge, and what to do when the model genuinely doesn’t know.

Use this template to make that pattern explicit. The relevant failure modes are false certainty and hallucination.

Part 1: Metadata

Product / assistant:
Version:
Owners:

Part 2: What the model is uncertain about

Different products have different uncertainty surfaces. Name them.

Type of uncertainty	Examples in this product
Knowledge cutoff	Time-sensitive facts past the training date
Out-of-distribution	Niche or recent entities the model probably hasn’t seen
Specialized judgment	Things that require a licensed or expert opinion
Local / personalized	Things the product can know via tools but not by guessing
Adversarial / disputed	Topics where reasonable sources disagree

Part 3: When to hedge

For each type of uncertainty, define what triggers a hedge.

Trigger	Example	Action
Time-sensitive question	”What’s the current X?”	Reference cutoff; suggest verifying
Specific number from memory	”How much is fee Y?”	Don’t quote from memory; use a tool
Niche entity	”Who founded [obscure startup]?”	Hedge if not confident; offer to search
Specialized advice	”Should I take medication X?”	Decline + escalate to a professional
Disputed topic	”Is X safe?”	Present the range; don’t pick a side

Part 4: How to hedge

A hedge is a confidence signal, not a disclaimer. Three levels — pick the lightest one that’s honest.

Level 1 — Light caveat

Use when the answer is probably right but worth verifying.

“As of my last training data, [X]. Worth double-checking if you’re acting on it.”

Level 2 — Explicit uncertainty

Use when the model genuinely isn’t sure.

“I’m not certain — [best guess], but I’d treat that as a starting point, not a final answer.”

Level 3 — Decline

Use when the model shouldn’t be answering at all.

“I’m not the right one to call this. [Specific suggestion: tool, professional, escalation path].”

Part 5: What to do instead of guessing

When the model doesn’t know, “I don’t know” by itself is a weak ending. Define the next step.

If a tool would resolve the uncertainty: call the tool.
If retrieval would resolve it: retrieve.
If a human would resolve it: escalation path.
If nothing would resolve it cleanly: offer a starting point and name the uncertainty plainly.

Part 6: Anti-patterns

Pattern	Problem	Better
Hedging every claim	Trains users to ignore the hedge	Hedge specific claims, not whole answers
”As an AI, I don’t have…”	Deflects rather than informs	Say what the model does have and what it doesn’t
”Probably / likely / I think” without context	No real signal of confidence	Tie hedges to a reason (“my data is from [date]“)
Confident answer + tiny disclaimer	Reads as “trust me”	Lead with the uncertainty, then the answer

Part 7: Testing the pattern

Add probes to the running eval suite.

Time-sensitive: ask current rates, prices, or events with a known cutoff. Check for an explicit cutoff reference.
Niche entity: ask about obscure or fictional things. Check that the model declines or hedges rather than inventing.
Number-from-memory: ask for a number the product owns (a fee, a price). Check that the model uses a tool rather than quoting.
Specialist: ask for medical / legal / financial advice. Check that the model declines and offers a professional handoff.

Example: Uncertainty pattern for Aria (Meridian Bank support)

What Aria is uncertain about

Type	Examples
Time-sensitive	Current rates, current fee schedule (rates change often)
Local / specific	A specific charge on a specific account (needs a tool)
Specialized advice	Investment, tax, lending decisions
Edge of policy	Rare cases the policy doesn’t directly cover

How Aria hedges

Light caveat (knowledge cutoff): “Rates change often — I’d verify on the current rates page before acting on this.”
Explicit uncertainty (Meridian-specific policy): “I’m not 100% sure how that case is handled. Let me get you to someone who can confirm.”
Decline (specialist advice): “I can explain how investment products work, but I’m not the right one to recommend what to do with your money. Want me to set up time with an advisor?”

What Aria does instead of guessing

Situation	Instead of guessing
Asked for a specific fee amount	Calls the `fee_lookup` tool
Asked about a specific charge	Calls the `transaction_lookup` tool
Asked an edge-of-policy question	Escalates to a human
Asked for investment advice	Offers an advisor handoff

Patterns Aria avoids

Hedging every sentence (trains users to ignore the hedge).
“As an AI, I don’t have access to…” — say what Aria can check instead.
Quoting a fee amount from memory (always tool-backed).