ChainForge | behavior.engineering

ChainForge is an open-source, browser-based tool for visually designing, running, and comparing prompt experiments across multiple models. Unlike code-first tools, it does not require scripting: you build a testing flow by connecting nodes that represent prompts, models, and evaluators. This makes it well suited to collaborative exploration sessions in which behavior architects, designers, and researchers want to try many combinations without waiting for an engineer to write evaluation scripts. For teams in the early stages of building an evaluation practice, ChainForge offers a low-barrier way to run systematic comparisons instead of testing prompts one at a time in a chat interface.
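
For readers who do think in code, it may help to see what a flow of prompt, model, and evaluator nodes amounts to conceptually. The sketch below is not ChainForge's API and makes no real model calls; the two "models" are hypothetical stand-in functions, and `run_flow`, `evaluate_length`, and the `{topic}` variable are names invented for illustration. It only shows the underlying idea of running every prompt and model combination and scoring each response.

```python
from itertools import product

# Hypothetical stand-ins for model nodes: plain functions, not real LLM calls.
def model_upper(prompt: str) -> str:
    return prompt.upper()

def model_reverse(prompt: str) -> str:
    return prompt[::-1]

def evaluate_length(response: str) -> int:
    """Toy evaluator node: scores a response by its character count."""
    return len(response)

def run_flow(templates, variables, models, evaluator):
    """Run every template x variable x model combination and score each response."""
    rows = []
    for template, value, (name, model) in product(templates, variables, models.items()):
        prompt = template.format(topic=value)  # fill the template variable
        response = model(prompt)               # query the (stand-in) model
        rows.append({
            "prompt": prompt,
            "model": name,
            "response": response,
            "score": evaluator(response),      # apply the evaluator
        })
    return rows

templates = ["Explain {topic}.", "Summarize {topic} in one line."]
variables = ["caching", "sharding"]
models = {"upper": model_upper, "reverse": model_reverse}

results = run_flow(templates, variables, models, evaluate_length)
for row in results:
    print(row["model"], row["score"], row["prompt"])
```

Two templates, two variable values, and two models yield eight scored responses in a single run. That grid of results is the systematic comparison a visual flow produces without anyone having to write a loop like this by hand.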