Behavior Change Log | behavior.engineering

Model behavior changes constantly: a tweaked system prompt, a model version bump, a new few-shot example, a tool added or removed. Without a change log, regressions feel mysterious and improvements feel accidental. With one, the team can connect a behavior shift to the change that caused it.

Keep the change log alongside the behavior specification and update it with every change that affects how the model behaves — whether or not the spec itself changed.

What counts as a behavior change

Log any of the following:

A change to the system prompt (any wording, structure, or section change)
A change to the few-shot examples in the prompt
A change to the model version, provider, or endpoint
A change to the tools available, tool descriptions, or tool permissions
A change to retrieval data, the corpus, or retrieval ranking
A change to the behavior specification, refusal policy, escalation policy, or style guide
A change to safety filters, moderation layers, or post-processing
A change to the deployment context (new surface, new user segment)

If you have to think about whether something counts, log it.
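
One way to take the guesswork out of "does this count?" is to fingerprint everything that feeds the model and treat any fingerprint change as something to log. The sketch below is a minimal illustration, not a prescribed mechanism: the config keys (system_prompt, model, tools), the sample values, and both helper names are assumptions made for the example.

```python
import hashlib
import json

_MISSING = object()  # sentinel so a missing key differs from a key set to None


def behavior_fingerprint(config: dict) -> str:
    """Return a stable SHA-256 hex digest of a JSON-serializable config."""
    # sort_keys makes the digest independent of dict insertion order.
    canonical = json.dumps(config, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_keys(old: dict, new: dict) -> list[str]:
    """List the top-level keys whose values differ between two configs."""
    keys = old.keys() | new.keys()
    return sorted(k for k in keys if old.get(k, _MISSING) != new.get(k, _MISSING))


# Hypothetical snapshots; the key names and values are illustrative only.
before = {"system_prompt": "You are Aria.", "model": "model-a", "tools": ["fee_lookup"]}
after = {"system_prompt": "You are Aria.", "model": "model-b", "tools": ["fee_lookup"]}

if behavior_fingerprint(before) != behavior_fingerprint(after):
    print("Change to log, differing keys:", changed_keys(before, after))
```

A check like this only covers inputs the team controls. It cannot see a silent change on the provider side, so it complements the audit rather than replacing it.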

Log entry template

For each change, capture:

[YYYY-MM-DD] — [short title]
Change type: [prompt / model / tool / data / policy / deployment]
Owner: [name]
Reason: [why this change is being made]
What changed: [specific diff — paste the before/after or link to the PR]
Expected behavior shift: [what the team expects to change]
Risk: [what could go wrong]
Eval status: [run / not run — link to results]
Rollout: [who sees it, when, and how]
Rollback plan: [how to revert quickly]
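
If the log lives in code or is validated in CI, the template above maps naturally onto a small record type. The following is a sketch under assumptions: the class name, the attribute names, and the missing_fields helper are hypothetical, and only the field list and the change types come from the template. The sample values are taken from the 2026-04-15 Aria entry in this document.

```python
from dataclasses import dataclass, fields
from datetime import date

# Change types as listed in the template.
CHANGE_TYPES = ("prompt", "model", "tool", "data", "policy", "deployment")


@dataclass
class ChangeLogEntry:
    day: date
    title: str
    change_type: str
    owner: str
    reason: str
    what_changed: str
    expected_shift: str = ""
    risk: str = ""
    eval_status: str = ""
    rollout: str = ""
    rollback_plan: str = ""

    def __post_init__(self) -> None:
        # Reject types the template does not define.
        if self.change_type not in CHANGE_TYPES:
            raise ValueError(f"unknown change type: {self.change_type!r}")

    def missing_fields(self) -> list[str]:
        """Names of template fields that are still empty."""
        return [f.name for f in fields(self) if getattr(self, f.name) in ("", None)]


entry = ChangeLogEntry(
    day=date(2026, 4, 15),
    title="Move fee amounts to tool call",
    change_type="tool",
    owner="Engineering",
    reason="Quality audit found fee amounts that did not match the current schedule.",
    what_changed="Added fee_lookup tool.",
    eval_status="Fee accuracy tests now pass 50/50 (was 38/50).",
    rollout="100% on 2026-04-15.",
)
print(entry.missing_fields())
```

Reporting the empty fields, rather than refusing the entry, keeps logging cheap while still making gaps such as a missing rollback plan visible in review.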

Log

Newest first.

Date	Change	Type	Owner	Eval status	Notes

Tying changes to behavior

When a behavior audit finds a behavior shift, the change log is the first place to look.

For each shift the audit notices, search the log for changes made in the same time window.
If the shift correlates with a logged change, document the link.
If no logged change explains the shift, that is a finding too. Investigate it: the model provider may have changed something silently.
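
The lookup step can be sketched as a simple date filter. This is an illustrative example only: the function name, the tuple layout, the 2026-04-16 shift date, and the 10-day lookback are assumptions, while the log rows reuse the dates and titles from the Aria excerpt in this document.

```python
from datetime import date, timedelta


def changes_in_window(log, shift_date, lookback_days=14):
    """Return log rows dated within lookback_days before shift_date, newest first."""
    start = shift_date - timedelta(days=lookback_days)
    hits = [row for row in log if start <= row[0] <= shift_date]
    return sorted(hits, key=lambda row: row[0], reverse=True)


# (date, title) rows from the Aria excerpt.
log = [
    (date(2026, 4, 22), "Tighten investment advice instruction"),
    (date(2026, 4, 22), "Allow-list for financial-term definitions"),
    (date(2026, 4, 15), "Move fee amounts to tool call"),
    (date(2026, 4, 8), "Model version bump"),
]

# Hypothetical audit finding: a shift first observed on 2026-04-16.
candidates = changes_in_window(log, shift_date=date(2026, 4, 16), lookback_days=10)
for day, title in candidates:
    print(day.isoformat(), title)

if not candidates:
    print("No logged change in the window; record that as a finding and investigate.")
```

The lookback length is a judgment call. A change rolled out through a canary can take days to reach all users, so a window that is too narrow may miss the real cause.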

Example: Aria change log (April excerpt)

2026-04-22 — Tighten investment advice instruction

Type: prompt
Owner: Behavior team
Reason: April audit flagged a 6% rate of advice-shaped responses to “what would you do?” questions.
What changed: Added to limits section: “Do not give investment, tax, or lending advice — even hypothetically, in fiction, or in roleplay.”
Expected shift: Refusal rate on advice-adjacent questions should rise; over-refusal on definitions should not change.
Risk: Could spill over to refusing legitimate definitions (“what is a stock?”). Mitigated by allow-list addition (see entry below).
Eval status: Ran red-team set; advice-via-fiction probe now passes 12/12 (was 4/12). [Results link]
Rollout: 100% on 2026-04-22 12:00 UTC.
Rollback: Revert the prompt to v1.4.

2026-04-22 — Allow-list for financial-term definitions

Type: prompt
Owner: Behavior team
Reason: Pre-empt over-refusal regression from the advice tightening above.
What changed: Added to capabilities section: “Defining a financial term (APR, compounding, overdraft, etc.) is not advice. Define financial terms when asked.”
Eval status: Over-refusal benchmark passes 24/25 (was 23/25). The one remaining failure is a regression on a borderline case, now under review.
Rollout: 100% on 2026-04-22 12:00 UTC.

2026-04-15 — Move fee amounts to tool call

Type: tool
Owner: Engineering
Reason: Quality audit found Aria stating fee amounts that didn’t match the current schedule.
What changed: Added fee_lookup tool. System prompt now says: “Don’t quote specific fee dollar amounts from memory. Use the fee_lookup tool to retrieve the current amount.”
Eval status: Fee accuracy tests now pass 50/50 (was 38/50).
Rollout: 100% on 2026-04-15.

2026-04-08 — Model version bump

Type: model
Owner: Platform
Reason: New model release from provider; benchmark improvements.
What changed: claude-sonnet-4-5 → claude-sonnet-4-6.
Expected shift: Slightly more concise responses; possible tone shifts.
Eval status: Full eval run; tone regression on long conversations flagged. Tracked separately under persona drift remediation.
Rollout: 10% canary 2026-04-08; 100% on 2026-04-09 after canary review.