Instruction Following | behavior.engineering

Instruction following is one of the most fundamental behavioral capabilities of a language model: can it do what it's told? This sounds simple, but in practice it involves interpreting ambiguous language, staying compliant with instructions across a long conversation, resolving instructions that conflict with each other, and correctly prioritizing directives from different sources (for example, the system prompt versus user messages). Models vary significantly in how reliably they follow instructions, and even capable models can lose track of constraints given much earlier in a conversation. For behavior architects, testing instruction following is a core evaluation task: verify that your system prompt instructions are actually followed under realistic conditions, including long multi-turn conversations, not just on your initial test cases.
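One way to make this kind of check concrete is a small adherence harness that runs every reply in a conversation through a set of constraint checks and reports which turns broke which rule. The sketch below is illustrative only: `find_violations`, the two example constraints, and the canned replies are all hypothetical stand-ins, and in a real evaluation the replies would come from your model under your actual system prompt.

```python
from typing import Callable, Dict, List

# A check maps one reply to True (constraint followed) or False (violated).
Check = Callable[[str], bool]


def find_violations(replies: List[str], checks: Dict[str, Check]) -> Dict[str, List[int]]:
    """Return, for each named constraint, the 0-based turns whose reply violates it."""
    violations: Dict[str, List[int]] = {name: [] for name in checks}
    for turn, reply in enumerate(replies):
        for name, check in checks.items():
            if not check(reply):
                violations[name].append(turn)
    return violations


# Hypothetical constraints a system prompt might impose (assumptions, not from the source).
CHECKS: Dict[str, Check] = {
    "max_50_words": lambda r: len(r.split()) <= 50,
    "no_exclamation_marks": lambda r: "!" not in r,
}

# Canned replies standing in for real model output over a conversation.
SAMPLE_REPLIES = [
    "Sure, here is a short answer.",
    "Here is another brief reply.",
    "Great question! " + "word " * 60,  # a later turn that drifts from both constraints
]

if __name__ == "__main__":
    print(find_violations(SAMPLE_REPLIES, CHECKS))
```

Recording the turn index, rather than a single pass/fail, shows whether a constraint is ignored from the start or dropped only later in the conversation, which is the failure mode described above.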