We are an independent research lab working toward responsible development and deployment in the public interest.

research report
Constructing datasets and training decoders to extract user models from language models
25 November 2025

research report
A new technique for tracing sparse and faithful circuits directly on a model's MLPs
20 November 2025

technical demonstration
Partnering with SWE-bench to enable reliable monitoring of AI coding agents
19 November 2025