
Anthropic, Member of Technical Staff
Focus
Control, Model Organisms, Red-Teaming, Scheming and Deception
Stream
Anthropic
Joe is a member of the Alignment Science team at Anthropic. He currently works on scalable oversight and is also interested in control, chain-of-thought monitoring, and alignment evaluations. For examples of recent projects, including MATS collaborations, see https://joejbenton.com/research/.