Stephen Casper

This stream will focus on impact-oriented technical AI governance research, potentially including work on open-weight models, applied AI safeguards, AI incidents, and technically rigorous AI policy.

Stream overview

Our stream projects will generally focus on a few types of topics:

  • Open-weight model safeguards: working to make AI systems with publicly downloadable weights more resistant to misuse, including by making them more tamper-resistant. 
  • Applied AI safeguards research: studying whether and how AI safeguards are applied in the real world, and analyzing the connections between company choices and downstream consequences. 
  • AI incidents: studying AI incidents and how they could be prevented. 
  • Technical rigor of AI policy: auditing laws for technical ambiguities, challenges, and loopholes. 
  • Miscellaneous AI governance research: guerrilla-style research to help policymakers make informed choices about emerging challenges in AI. 

Mentors

Stephen Casper (Cas)
Harvard, Assistant Professor
Boston
Adversarial Robustness
Policy and Governance
Red-Teaming
Safeguards

Stephen (“Cas”) Casper is a computer scientist and Assistant Professor of Public Policy at the Harvard Kennedy School. He previously worked at MIT and the UK AI Security Institute. His work focuses on AI safeguards and technical governance. His research has been featured at NeurIPS, AAAI, Nature, FAccT, EMNLP, SaTML, TMLR, IASEAI, several course curricula, a number of workshops, and over 20 news articles and newsletters. He is also a writer for the International AI Safety Report and the Singapore Consensus. In addition to MATS, he mentors for ERA and GovAI. He has worked closely with over 30 mentees on AI safety- and governance-related research projects.

Projects in this stream will likely follow a default research process:

  1. Follow and engage with contemporary discussions, debates, and proposals related to AI governance.
  2. Get disappointed, confused, or frustrated.
  3. Write a technical paper to improve the discussion. 
  4. Spend a lot of time and effort communicating it to the audience that needs it. 

Mentorship style

By default, we should expect to meet 2-3 times per week as a full group, plus ad hoc project-specific meetings. 

Fellows we are looking for

Green flags include:

  • Research tenacity: demonstrated ability to pursue self-directed work, make things happen through determination, teach oneself whatever skills are required for a project, and succeed even when not set up to succeed. As an example, I think it is a strong green flag when an undergrad pursues side projects in a self-directed manner rather than only pursuing projects under classes, internships, jobs, etc. 
  • Research taste: the AI research space is noisier than ever, and almost all AI research has little to no practical value. Putting impact over interest and designing projects around a specific plan for impact is the single most important skill for good AI work. 
  • Experience across most or all of the full project stack: ideating, planning, experimenting, writing, and publishing. 

This stream will follow an academic collaboration model. Scholars will be free to discuss and collaborate externally. However, scholars should also expect to work in collaboration with others in the stream.

Project selection

I will work with MATS scholars to iteratively refine project ideas in whatever areas our interests and skills overlap. Above all, project selection will hinge on having a clear (and good) theory of impact.