Dan Mossing

Anthropic

Member of technical staff

Focus

Interpretability

I am an interpretability researcher at Anthropic. I am most interested in simple, practical interpretability approaches aimed at making models safer. In a previous life, I worked as a neuroscientist.