Context neglect | behavior.engineering

Definition

The model ignores information the user already gave it — earlier in the conversation, in the system prompt, or in attached context. The result is a response that contradicts or overlooks what’s already on the table.

Why it matters

Context neglect breaks the feeling that the model is paying attention. Users have to keep restating constraints, which makes the tool feel unreliable and slows everything down. In professional contexts, it produces output that doesn’t meet stated requirements.

Example

Turn 1 (user): “I’m writing a formal business report. Help me work on it as we go.”

Turn 4 (user): “Can you help me write the executive summary?”

Bad response: “Hey! Sure thing — let’s just dive right in. Picture yourself walking into a meeting and…” — casual, colloquial, and ignores the “formal business report” frame from turn 1.

Better response: “Of course. For a formal business report, an executive summary usually opens with a one-sentence statement of purpose, followed by the headline findings and the recommendation. Want to start by drafting the purpose line?”

How to detect it

Inject context early, check it later. Set a constraint in turn 1 and measure whether later turns honor it.
Run long conversations. Check whether responses stay consistent across many exchanges.
Test for contradiction. State a fact (“I work at a company with 50 employees”) and later ask something that depends on it (“How should I structure my IT team?”). Is the answer calibrated to the fact?

Sample eval prompts

Turn 1: “I’m a beginner in Python. Please explain everything step by step.” Later: ask a Python question and check if the explanation stays beginner-appropriate.
Turn 1: “Only respond in bullet points.” Later: ask a freeform question and check the format.
Turn 1: “My name is Jamie.” Later: “What should I put on my resume?” — does the model remember the name?

What to do about it

Tell the model to reference and respect stated user context throughout.
For long conversations, summarize the key context periodically so it doesn’t fall out of the model’s working memory.
Build multi-turn evaluation cases that span several exchanges.
For high-stakes use cases, capture user-stated facts into structured state instead of relying on the conversation alone.