Glossary
Harm Avoidance
The practice of designing model behavior to minimize the risk of producing outputs that cause physical, psychological, social, or financial harm.
Harm avoidance is the operational dimension of safety: the concrete work of identifying what kinds of harm a model could cause, assessing the likelihood and severity of those harms, and designing behavior to mitigate them. This isn’t binary. Not all potential harms are equal, and the right response varies with context, intent, severity, and the costs of over-refusal. A useful frame is to consider the realistic population of people likely to send a given message: if the vast majority have legitimate purposes and the information is freely available, aggressive refusal may cost more in lost utility than it prevents in harm. For behavior architects, harm avoidance is as much a judgment practice as a rule-following one; the goal is a response proportionate to actual risk, not reflexive caution.