We are an independent research lab working toward the responsible development and deployment of AI systems in the public interest.
technical demonstration
A system for analyzing and intervening on agent behavior
24 March 2025
research report
Open-source AI systems trained to describe components of other AI systems at the level of a human expert
23 October 2024
research report
Language models trained to automatically surface harmful behaviors in language models
23 October 2024