Glossary
Training Signal
Any information fed back to a model during training to indicate whether its behavior is on the right track.
A training signal is what tells the model how it’s doing, whether that’s a reward score, a human preference judgment, or a correction showing the right response. Without a training signal, the model has no way to improve, and the quality and clarity of the signal directly determine what it learns. Weak or noisy signals (feedback that’s inconsistent, ambiguous, or misaligned with actual goals) can lead the model in the wrong direction even when training looks successful on the surface. For behavior architects, thinking critically about what signal your data actually provides (rather than what you intended it to provide) is one of the most valuable habits you can develop.