MATS Fellow:
Isha Gupta
Authors:
Isha Gupta, Kai Fronsdal, Abhay Sheshadri, Jonathan Michala, Jacqueline Tay, Rowan Wang, Samuel R. Bowman, Sara Price
Citations
Abstract:
We are releasing Bloom, an agentic framework for developing behavioral evaluations. Bloom's evaluations are reproducible and targeted: unlike open-ended auditing, Bloom takes a researcher-specified behavior and quantifies its frequency and severity across automatically generated scenarios. Bloom's evaluations correlate strongly with our hand-labelled judgments and reliably separate baseline models from intentionally misaligned ones. As examples, we also release benchmark results for four alignment relevant behaviors on 16 models. Bloom is available at github.com/safety-research/bloom.
Bloom: an open source tool for automated behavioral evaluations
Authors:
Isha Gupta
Date:
December 19, 2025
Citations:
0
Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers
Authors:
Adam Karvonen
Date:
December 17, 2025
Citations:
0
The MATS Program is an independent research and educational initiative connecting emerging researchers with mentors in AI alignment, governance, and security.
Each MATS cohort runs for 12 weeks in Berkeley, California, followed by an optional 6–12 month extension in London for selected scholars.