Jacob Hilton

Alignment Research Center (ARC)

Researcher

Focus

Interpretability

Jacob Hilton is a researcher at the Alignment Research Center (ARC), a nonprofit working on the theoretical foundations of mechanistic interpretability. He previously worked at OpenAI on reinforcement learning from human feedback, scaling laws and interpretability. His background is in pure mathematics, and he holds a PhD in set theory from the University of Leeds, UK.