Streams in this track include hands-on research using machine learning experiments to understand and improve model safety including AI control, interpretability, scalable oversight, evaluations, red-teaming, and robustness. This is the largest track in the program and is defined by its methods rather than any single research agenda. If your primary tool is ML engineering, this is your track.
The track is defined by its methodology more than by any single research agenda. Fellows run ML experiments to understand and improve the safety properties of frontier models, with work spanning interpretability, AI control, scalable oversight, evaluations, red-teaming, robustness, and model organisms of misalignment. The unifying thread is that progress comes from getting hands on real models (training, probing, fine-tuning, measuring) rather than reasoning from first principles alone. This is the largest track in the program and the most common entry point into technical AI safety research.
We are looking for fellows whose primary tool is ML engineering, broadly construed. The essential requirement is the ability to design and run experiments on language models or other deep learning systems and iterate quickly on the results. In practice that usually means strong Python (with and without AI coding tools), comfort with the infrastructure around running models at moderate scale, and enough research taste to know which experiments are worth running. Mission alignment matters: fellows should be able to say why a given line of empirical work meaningfully reduces frontier risk, not just whether it yields a successful publication. Educational background and seniority are weighted lightly here relative to other tracks. Past cohorts have included strong fellows ranging from undergraduates to senior industry researchers.
Fellows are matched to mentors based on fit, and projects are scoped to produce concrete artifacts by program end: papers, evaluation suites, open-source tooling, or technical reports. Target audiences include safety and alignment teams at frontier labs, governments and other evaluation organizations, the broader ML research community.
Implementing SL4/5 and searching for differentially defense-favored security tools.
I love asynchronous collaboration and I'm happy to provide frequent small directional feedback, or do thorough reviews of your work with a bit more lead time. A typical week should look like either trying out a new angle on a problem, or making meaningful progress towards productionizing an existing approach.
Essential:
Preferred:
Mentor(s) will talk through project ideas with scholar, or scholar will pick from a list of projects.
This stream will pursue research on securing and hardening AI systems through rigorous testing, provable defenses, and formal specification, including improving benchmarks for agentic security, scaling mathematically-grounded robustness techniques like randomized smoothing and Lipschitz-constrained training, and developing formal methods for specifying safe agent behaviors.
Programming experience, some experience with using AI based systems and mathematical maturity would be great for all the projects.
Beyond that, if someone has prior experience with building AI benchmarks, red teaming, formal methods etc. that would be great too.
We are excited to supervise projects that fall within the two following categories:
For 1., we are particularly interested in:
For 2., we are especially interested in:
Essential knowledge:
Essential experience:
Desired experience:
Bonus:
Lee's stream will focus primarily on improving mechanistic interpretability methods for reverse-engineering neural networks.
Mentorship looks like a 1 h weekly meeting by default with approximately daily slack messages in between. Usually these meetings are just for updates about how the project is going, where I’ll provide some input and steering if necessary and desired. If there are urgent bottlenecks I’m more than happy to meet in between the weekly interval or respond on slack in (almost always) less than 24h. We'll often run daily standup meetings if timezones permit, but these are optional.
As an indicative guide (this is not a score sheet), in no particular order, I evaluate candidates according to:
In the past cohort I chose a diversity of candidates with varying strengths and I think this worked quite well. Some mentees were outstanding in particular dimensions, others were great all rounders.
In general I'd like projects in my stream to at least be informed by SPD if not build on it directly. Scholars and I will discuss projects and come to a consensus on what feels like a good direction. I will not tell scholars to work on a particular direction, since in my experience intrinsic motivation to work on a particular direction is important for producing good research.
This stream will work on projects that empirically assess national security threats of AI misuse (CBRN terrorism and cyberattacks) and improve dangerous capability evaluations. Threat modeling applicants should have a skeptical mindset, enjoy case study work, and be strong written communicators. Eval applicants should be able and excited to help demonstrate concepts like sandbagging elicitation gaps in an AI misuse context.
Typically, this would include weekly meetings, detailed comments on drafts, and asynchronous messaging.
For threat modeling work: Skeptical mindset, transparent reasoning, analytical
For evaluations, mitigations, and verification work: LLM engineering skills (e.g., agent orchestration), biosecurity knowledge
Mentor(s) will talk through project ideas with scholar
Priority directions:
I usually spend at least 30 min per week in one-one-one meetings with my mentees. We can also discuss longer time slots if necessary. Besides these time slots, I try to be as responsive as possible over Slack (>2 comprehensive responses per day) and read relevant papers between weekly meetings.
I'm looking for the following skills:
I would prefer to set the overall direction, but I will listen closely to scholars about their preferences within a broad direction. Converging on a particular topic is expected to be a collaborative process.
We will continue working on black-box monitors for scheming in complex agentic settings, building on the success of the previous stream.
See here for details.
We have two weekly 60-minute calls by default. Since everyone will work on the same project, these calls will be with all participants of the stream. I respond on slack on a daily basis for asynchronous messages. Scholars will have a lot of freedom for day-to-day decisions and direction setting. In the best case, you will understand the project better than me after a few weeks and have a clear vision for where it should be heading. I recommend scholars focus 100% of their work time on the project and not pursue anything on the side. I think this way people will learn the most in MATS.
You will work on subprojects of black box monitoring. See here for details.
AI control focussed stream, probably running in-person in London.
I'm pretty hands-off. I expect scholars to fully take charge of the project, and update / consult me as needed. I do want my scholars to succeed, and am happy to advise on project direction, experiment design, interpreting results, decision-making / breaking ties, or getting unstuck.
During the program, we'll meet once a week to go through any updates / results, and your plans for the next week. I'm also happy to comment on docs, respond on Slack, or have additional ad hoc meetings when useful.
I'll propose ~5 projects for scholars to red-team, flesh out and decide on one to own. I'm also open to scholar-proposed projects if they sound promising; I'd just be less useful as an advisor.
The MATS Program is a 10-week research fellowship designed to train and support emerging researchers working on AI alignment, transparency and security. Fellows collaborate with world-class mentors, receive dedicated research management support, and join a vibrant community in Berkeley focused on advancing safe and reliable AI. The program provides the structure, resources, and mentorship needed to produce impactful research and launch long-term careers in AI safety.
MATS mentors are leading researchers from a broad range of AI safety, alignment, governance, field-building and security domains. They include academics, industry researchers, and independent experts who guide scholars through research projects, provide feedback, and help shape each scholar’s growth as a researcher. The mentors represent expertise in areas such as:
Key dates
Application:
The main program will then run from September 28th to December 4th, with the extension phase for accepted fellows beginning in December.
MATS accepts applicants from diverse academic and professional backgrounds - from machine learning, mathematics, and computer science to policy, economics, physics, cognitive science, biology, and public health, as well as founders, operators, and field-builders without traditional research backgrounds. The primary requirements are strong motivation to contribute to AI safety and evidence of technical aptitude, research potential, or relevant operational experience. Prior AI safety experience is helpful but not required.
Applicants submit a general application, applying to various tracks (Empirical, Theory, Strategy & Forecasting, Policy & Governance, Systems Security, Biosecurity, Founding & Field-Building.
In stage 2, applicants apply to streams within those tracks as well as completing track specific evaluations.
After a centralized review period, applicants who are advanced will then undergo additional evaluations depending on the preferences of the streams they've applied to before doing final interviews and receiving offers.
For more information on how to get into MATS, please look at this page.