
Google DeepMind, Research scientist
Focus
Scheming and Deception, Control, Red-Teaming
Stream
Google DeepMind
I am a research scientist on the AGI Safety & Alignment team at Google DeepMind. I focus on deceptive alignment and AI control, particularly scheming propensity evaluations. My past research includes dangerous capability evaluations, power-seeking incentives, specification gaming, and avoiding side effects.