Confidence | behavior.engineering

Confidence in a model’s output is how certain the model appears to be about what it says. It can be expressed explicitly (“I’m quite sure that…”) or implicitly through tone and phrasing (stating something as fact versus qualifying it as a possibility). High confidence isn’t inherently good; it is appropriate only when the model is likely to be correct. Overconfidence is a significant problem in AI models because it leads users to accept hallucinations uncritically. Underconfidence (excessive hedging) fails in the other direction, undermining trust and usefulness. For behavior architects, designing confidence expression is part of calibration work: the goal is a model that sounds exactly as certain as the evidence warrants, no more and no less.
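One way to make "no more and no less confident than the evidence warrants" concrete is to compare stated confidence against observed accuracy. The sketch below is illustrative only: it assumes each model answer has already been reduced to a numeric confidence between 0 and 1 and graded as correct or incorrect, which is itself a nontrivial step for confidence expressed through tone. The function names and the five-bin default are hypothetical choices, not part of any particular toolkit.

```python
from collections import defaultdict


def calibration_report(records, n_bins=5):
    """Group (confidence, was_correct) pairs into equal-width confidence bins.

    Returns {bin_index: (mean_confidence, accuracy, count)}.
    A bin where mean_confidence > accuracy indicates overconfidence;
    mean_confidence < accuracy indicates underconfidence.
    """
    bins = defaultdict(list)
    for confidence, was_correct in records:
        # A confidence of exactly 1.0 belongs in the top bin.
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, was_correct))

    report = {}
    for index, items in sorted(bins.items()):
        mean_confidence = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        report[index] = (mean_confidence, accuracy, len(items))
    return report


def expected_calibration_error(records, n_bins=5):
    """Average |confidence - accuracy| across bins, weighted by bin size."""
    records = list(records)
    if not records:
        return 0.0
    report = calibration_report(records, n_bins)
    return sum(
        count / len(records) * abs(mean_confidence - accuracy)
        for mean_confidence, accuracy, count in report.values()
    )


# Hypothetical graded answers: (stated confidence, was the answer correct?)
graded = [(0.9, True), (0.9, False), (0.5, True), (0.5, False)]
print(calibration_report(graded))
print(expected_calibration_error(graded))
```

In this toy data the answers stated at 0.9 confidence are right only half the time, so that bin is overconfident, while the answers stated at 0.5 are right half the time and are well calibrated. A score near zero means stated confidence tracks accuracy; it does not say whether the model is accurate in absolute terms.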