
Anthropic, Member of technical staff
Focus
Interpretability
Stream
Dan Mossing
I am an interpretability researcher at Anthropic. I am most interested in simple, practical interpretability approaches that are targeted at making models safer. In a previous life, I worked as a neuroscientist.