Governance template
Behavior Change Log
A template for tracking changes to model behavior over time — what changed, when, why, and what it affected.
Model behavior changes constantly: a tweaked system prompt, a model version bump, a new few-shot example, a tool added or removed. Without a change log, regressions feel mysterious and improvements feel accidental. With one, the team can connect a behavior shift to the change that caused it.
Keep the change log alongside the behavior specification and update it with every change that affects how the model behaves — whether or not the spec itself changed.
What counts as a behavior change
Log any of the following:
- A change to the system prompt (any wording, structure, or section change)
- A change to the few-shot examples in the prompt
- A change to the model version, provider, or endpoint
- A change to the tools available, tool descriptions, or tool permissions
- A change to retrieval data, the corpus, or retrieval ranking
- A change to the behavior specification, refusal policy, escalation policy, or style guide
- A change to safety filters, moderation layers, or post-processing
- A change to the deployment context (new surface, new user segment)
If you have to think about whether something counts, log it.
Log entry template
For each change, capture:
[YYYY-MM-DD] — [short title]
Change type: [prompt / model / tool / data / policy / deployment]
Owner: [name]
Reason: [why this change is being made]
What changed: [specific diff — paste the before/after or link to the PR]
Expected behavior shift: [what the team expects to change]
Risk: [what could go wrong]
Eval status: [run / not run — link to results]
Rollout: [who sees it, when, and how]
Rollback plan: [how to revert quickly]
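If the log is kept in a machine-readable form, the template above could be modeled roughly as follows. This is a minimal sketch, not a prescribed schema: the class name `ChangeLogEntry`, its field names, the `problems` and `table_row` helpers, and the choice of which fields count as required are all assumptions made for illustration. Only the change types and the table columns come from the template itself.

```python
from dataclasses import dataclass
from datetime import date

# Change types taken from the template's "Change type" field.
CHANGE_TYPES = {"prompt", "model", "tool", "data", "policy", "deployment"}


@dataclass
class ChangeLogEntry:
    """One change-log entry. Field names are illustrative, not a fixed schema."""
    day: date
    title: str
    change_type: str
    owner: str
    reason: str
    what_changed: str
    expected_shift: str = ""
    risk: str = ""
    eval_status: str = "not run"
    rollout: str = ""
    rollback_plan: str = ""

    def problems(self) -> list[str]:
        """Return human-readable problems; an empty list means none were found."""
        found = []
        if self.change_type not in CHANGE_TYPES:
            found.append(f"unknown change type: {self.change_type!r}")
        # Assumption: these fields are treated as required for every entry.
        for name in ("title", "owner", "reason", "what_changed", "rollback_plan"):
            if not getattr(self, name).strip():
                found.append(f"missing field: {name}")
        return found

    def table_row(self, notes: str = "") -> str:
        """Render one row: Date | Change | Type | Owner | Eval status | Notes."""
        cells = [self.day.isoformat(), self.title, self.change_type,
                 self.owner, self.eval_status, notes]
        return "| " + " | ".join(cells) + " |"


# Usage: an entry with no rollback plan is reported as incomplete.
entry = ChangeLogEntry(
    day=date(2026, 4, 8),
    title="Model version bump",
    change_type="model",
    owner="Platform",
    reason="New model release from provider; benchmark improvements.",
    what_changed="claude-sonnet-4-5 -> claude-sonnet-4-6",
    eval_status="run",
)
print(entry.problems())   # ['missing field: rollback_plan']
print(entry.table_row())
```

A check like `problems()` could run in review or CI so that entries without a rollback plan are caught before rollout. Whether that is worth automating depends on how the team maintains the log.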
Log
Newest first.
| Date | Change | Type | Owner | Eval status | Notes |
|---|---|---|---|---|---|
Tying changes to behavior
When a behavior audit finds a behavior shift, the change log is the first place to look.
- For each shift the audit notices, search the log for changes in the same window.
- If the shift correlates with a logged change, document the link.
- If no logged change explains the shift, that’s a finding too: investigate, since the model provider may have changed something silently.
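The window search described above can be sketched in a few lines. This is an illustration under stated assumptions: the function name `changes_in_window`, the dict-based entry shape, the `lookback_days` parameter and its default, and the audit date in the usage are all hypothetical. The entries mirror the Aria excerpt in this document.

```python
from datetime import date, timedelta


def changes_in_window(log, shift_first_seen, lookback_days=14):
    """Return logged changes dated from lookback_days before the shift was
    first seen up to and including that day, newest first."""
    start = shift_first_seen - timedelta(days=lookback_days)
    hits = [e for e in log if start <= e["date"] <= shift_first_seen]
    return sorted(hits, key=lambda e: e["date"], reverse=True)


log = [
    {"date": date(2026, 4, 22), "title": "Tighten investment advice instruction", "type": "prompt"},
    {"date": date(2026, 4, 22), "title": "Allow-list for financial-term definitions", "type": "prompt"},
    {"date": date(2026, 4, 15), "title": "Move fee amounts to tool call", "type": "tool"},
    {"date": date(2026, 4, 8), "title": "Model version bump", "type": "model"},
]

# Hypothetical audit finding: a tone shift first noticed on 2026-04-10.
candidates = changes_in_window(log, date(2026, 4, 10), lookback_days=7)
for entry in candidates:
    print(entry["date"], entry["type"], entry["title"])
if not candidates:
    print("No logged change in the window: record this as a finding and investigate.")
```

A date match only narrows the list of suspects. Confirming that a change caused the shift still takes an eval run or a rollback test.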
Example: Aria change log (April excerpt)
2026-04-22 — Tighten investment advice instruction
- Type: prompt
- Owner: Behavior team
- Reason: April audit flagged 6% rate of advice-shaped responses to “what would you do?” questions.
- What changed: Added to limits section: “Do not give investment, tax, or lending advice — even hypothetically, in fiction, or in roleplay.”
- Expected shift: Refusal rate on advice-adjacent questions should rise; over-refusal on definitions should not change.
- Risk: Could spill over to refusing legitimate definitions (“what is a stock?”). Mitigated by allow-list addition (see entry below).
- Eval status: Ran red-team set; advice-via-fiction probe now passes 12/12 (was 4/12). [Results link]
- Rollout: 100% on 2026-04-22 12:00 UTC.
- Rollback: Revert the prompt to v1.4.
2026-04-22 — Allow-list for financial-term definitions
- Type: prompt
- Owner: Behavior team
- Reason: Pre-empt over-refusal regression from the advice tightening above.
- What changed: Added to capabilities section: “Defining a financial term (APR, compounding, overdraft, etc.) is not advice. Define financial terms when asked.”
- Eval status: Over-refusal benchmark passes 24/25 (was 23/25). One regression on a borderline case under review.
- Rollout: 100% on 2026-04-22 12:00 UTC.
2026-04-15 — Move fee amounts to tool call
- Type: tool
- Owner: Engineering
- Reason: Quality audit found Aria stating fee amounts that didn’t match the current schedule.
- What changed: Added fee_lookup tool. System prompt now says: “Don’t quote specific fee dollar amounts from memory. Use the fee_lookup tool to retrieve the current amount.”
- Eval status: Fee accuracy tests now pass 50/50 (was 38/50).
- Rollout: 100% on 2026-04-15.
2026-04-08 — Model version bump
- Type: model
- Owner: Platform
- Reason: New model release from provider; benchmark improvements.
- What changed: claude-sonnet-4-5 → claude-sonnet-4-6.
- Expected shift: Slightly more concise responses; possible tone shifts.
- Eval status: Full eval run; tone regression on long conversations flagged. Tracked separately under persona drift remediation.
- Rollout: 10% canary 2026-04-08; 100% on 2026-04-09 after canary review.