Data Guideline Authorship | behavior.engineering

Data guidelines are the operational layer between a behavioral specification and actual labeled data — they translate abstract goals (“be helpful but avoid harm”) into concrete, answerable annotation questions (“given this user message, rate this response from 1 to 5 on helpfulness, where 1 means…”). Writing effective guidelines requires anticipating how annotators will interpret instructions, providing worked examples for both clear cases and common ambiguities, and testing the guidelines in a calibration round before full annotation begins. For behavior architects, guideline authorship is often where the most nuanced behavioral thinking happens: the gaps and ambiguities that surface during guideline writing reveal genuine philosophical questions about what the model should value and how it should reason.