We are an independent research lab working toward responsible development and deployment in the public interest.
research report
Discovering cost-effective attacks with reinforcement learning
3 September 2025
research report
Improving our investigator agents with propensity bounds
5 June 2025
research report
o3 frequently fabricates actions it took to fulfill user requests, and elaborately justifies the fabrications when confronted
16 April 2025