MATS mentors are advancing the frontiers of AI alignment, transparency, and security

Keri Warr
Anthropic, Security Engineering Manager

Technical Lead of the Infrastructure Security Engineering team at Anthropic, implementing SL4/5 and searching for differentially defense-favored security tools.

Focus: Empirical (Security, Compute and Hardware)

Tyler Tracy
Redwood Research, Member of Technical Staff

Focus: Empirical (Control, Model Organisms, Scheming and Deception, Strategy and Forecasting)

Daniel is a professor of computer science at UIUC, where he studies the progress of AI, with a particular focus on dangerous capabilities of AI agents. His work includes:

- CVE-Bench, an award-winning benchmark (SafeBench award, ICML spotlight) used by frontier labs and governments to measure AI agents' ability to find and exploit real-world vulnerabilities.

- Agent Benchmark Checklist, an award-winning work (Berkeley AI Summit, 1st place, Benchmarks & Evaluations track) that highlights major issues in existing benchmarks.

- InjecAgent, one of the first AI agent safety benchmarks, used by governments and major labs.

Focus: Empirical (Biorisk, Security, Dangerous Capability Evals)
Sara Price
Anthropic, Member of Technical Staff

Focus: Empirical (Control, Model Organisms, Red-Teaming, Scheming and Deception)

Roger Grosse
Anthropic, Associate Professor

Focus: Empirical (Interpretability)

Xander Davies
UK AISI, Safeguards Team Lead

Xander Davies is a Member of the Technical Staff at the UK AI Security Institute, where he leads the Red Teaming group, which uses adversarial ML techniques to understand, attack, and mitigate frontier AI safeguards. He is also a PhD student at the University of Oxford, supervised by Dr. Yarin Gal. He previously studied computer science at Harvard, where he founded and led the Harvard AI Safety Team.

Focus: Empirical (Monitoring, Adversarial Robustness, Control, Model Organisms, Red-Teaming, Dangerous Capability Evals, Safeguards)

Maksym Andriushchenko
ELLIS Institute Tübingen, Principal Investigator (AI Safety and Alignment Group)

I am a principal investigator at the ELLIS Institute Tübingen and the Max Planck Institute for Intelligent Systems, where I lead the AI Safety and Alignment group. I also serve as chapter lead for the new edition of the International AI Safety Report chaired by Prof. Yoshua Bengio. I have worked on AI safety with leading organizations in the field (OpenAI, Anthropic, UK AI Safety Institute, Center for AI Safety, Gray Swan AI). I obtained my PhD in machine learning from EPFL in 2024 advised by Prof. Nicolas Flammarion. My PhD thesis was awarded the Patrick Denantes Memorial Prize for the best thesis in the CS department of EPFL and was supported by the Google and Open Phil AI PhD Fellowships.

Focus: Empirical (Dangerous Capability Evals, Agent Foundations, Adversarial Robustness, Monitoring, Scalable Oversight, Scheming and Deception)

Neev Parikh
METR, Member of Technical Staff

I like to make computers do interesting things, deeply understand concepts, and build useful tools. I’m currently thinking about AI alignment, control, and evaluations, and work with frontier models at METR.

Recent work includes MALT, training models to fool monitors in QA settings, and RE-Bench.

I've previously worked at Stripe and CSM, and did a concurrent BSc/MSc in Computer Science at Brown. 

Focus: Empirical (Dangerous Capability Evals, Red-Teaming, Model Organisms, Control, Monitoring)

Sarah Schwettmann
Transluce, Co-Founder and Chief Scientist

I’m a Research Scientist in MIT CSAIL with the MIT-IBM Watson AI Lab. I did my PhD in Brain and Cognitive Sciences at MIT, as an NSF Fellow working with Josh Tenenbaum and Antonio Torralba. My work investigates representations underlying intelligence in artificial (and previously, biological) neural networks.

Focus: Empirical (Interpretability, Monitoring, Dangerous Capability Evals)

Jacob Hilton
ARC, Researcher and Executive Director

Jacob Hilton is a researcher and the executive director at the Alignment Research Center (ARC), a nonprofit working on the theoretical foundations of mechanistic interpretability. He previously worked at OpenAI on reinforcement learning from human feedback, scaling laws and interpretability. His background is in pure mathematics, and he holds a PhD in set theory from the University of Leeds, UK.

Focus: Theory (Interpretability)

Eli is working on AI scenario forecasting with the AI Futures Project, where he co-authored AI 2027. He advises Sage, an organization he cofounded that works on AI Digest (interactive AI explainers) and forecasting tools. He previously worked on the AI-powered research assistant Elicit.

Focus: Policy and Strategy (Strategy and Forecasting, Policy and Governance)

Michael Chen
METR, Member of Policy Staff

Michael Chen works on AI policy at METR and is an incoming part-time PhD student at Oxford in technical AI governance. Michael previously worked as a software engineer at Stripe. METR's policy team has assisted companies like Google DeepMind, Amazon, and Anthropic with developing their frontier safety policies – voluntary commitments to evaluate and mitigate severe AI risks. Besides corporate advising, Michael has provided feedback on U.S. state bills and the EU AI Act GPAI Code of Practice.

Focus: Technical Governance (Dangerous Capability Evals, Policy and Governance)

Tomek Korbak
OpenAI, Member of Technical Staff

I’m a Member of Technical Staff at OpenAI working on monitoring LLM agents for misalignment. Previously, I worked on AI control and safety cases at the UK AI Security Institute and on honesty post-training at Anthropic. Before that, I did a PhD at the University of Sussex with Chris Buckley and Anil Seth focusing on RL from human feedback (RLHF) and spent time as a visiting researcher at NYU working with Ethan Perez, Sam Bowman and Kyunghyun Cho.

Focus: Empirical (Control, Monitoring, Dangerous Capability Evals)

Fynn Heide
Safe AI Forum, Executive Director

Fynn Heide is the Executive Director of Safe AI Forum. He studied at the University of Warwick and has done research on China and AI governance.

Focus: Policy and Strategy (Policy and Governance)

Stephen (“Cas”) Casper is a final-year Ph.D. student at MIT in the Algorithmic Alignment Group, advised by Dylan Hadfield-Menell. His work focuses on AI safeguards and technical governance. His research has been featured at NeurIPS, AAAI, Nature, FAccT, EMNLP, SaTML, TMLR, IRAIS, in several course curricula, in a number of workshops, and in over 20 news articles and newsletters. He is also a writer for the International AI Safety Report and the Singapore Consensus. In addition to MATS, he mentors for ERA and GovAI. In the past, he has worked closely with over 30 mentees on various safety-related research projects.

Focus: Technical Governance (Adversarial Robustness, Policy and Governance, Red-Teaming, Safeguards)

I am a research scientist on the AGI Safety & Alignment team at Google DeepMind. I am currently focusing on deceptive alignment and AI control (recent work: https://arxiv.org/abs/2505.01420), particularly scheming propensity evaluations and honeypots. My past research includes power-seeking incentives, specification gaming, and avoiding side effects. 

Focus: Empirical (Scheming and Deception, Dangerous Capability Evals, Control, Red-Teaming)

Eric Neyman is a researcher at the Alignment Research Center (ARC), which is working on a systematic and theoretically grounded approach to mechanistic interpretability. Before joining ARC, he was a PhD student at Columbia University, where he researched algorithmic Bayesian epistemology.

Focus: Theory (Interpretability)

Yafah Edelman
Epoch AI, Head of Data & Trends

Yafah Edelman is the head of the data team at Epoch AI. She researches the inputs that allow AI to scale, as well as its impacts.

Focus: Compute Infrastructure (Compute and Hardware)

He He
New York University, Associate Professor

He He is an associate professor at New York University. She is interested in how large language models work and in the potential risks of this technology.

Focus: Empirical (Monitoring, Dangerous Capability Evals, Scalable Oversight, Safeguards)

Mauricio Baker
RAND, Technical AI Policy Research Scientist and PhD Student

Mauricio researches AI policy at RAND. His work has focused on AI hardware governance, especially export controls and verification of international agreements on AI. He’s more broadly interested in technical AI governance and in studying policy options the field might be overlooking. Previously, Mauricio contracted with OpenAI and did a master’s in Computer Science at Stanford University.

Focus: Technical Governance (Compute and Hardware, Policy and Governance)
