Streams in this track involve hands-on research that uses machine learning experiments to understand and improve model safety, including AI control, interpretability, scalable oversight, evaluations, red-teaming, and robustness. This is the largest track in the program and is defined by its methods rather than any single research agenda. If your primary tool is ML engineering, this is your track.
The track is defined by its methodology more than by any single research agenda. Fellows run ML experiments to understand and improve the safety properties of frontier models, with work spanning interpretability, AI control, scalable oversight, evaluations, red-teaming, robustness, and model organisms of misalignment. The unifying thread is that progress comes from getting hands on real models (training, probing, fine-tuning, measuring) rather than reasoning from first principles alone. This is the largest track in the program and the most common entry point into technical AI safety research.
We are looking for fellows whose primary tool is ML engineering, broadly construed. The essential requirement is the ability to design and run experiments on language models or other deep learning systems and iterate quickly on the results. In practice that usually means strong Python (with and without AI coding tools), comfort with the infrastructure around running models at moderate scale, and enough research taste to know which experiments are worth running. Mission alignment matters: fellows should be able to say why a given line of empirical work meaningfully reduces frontier risk, not just whether it yields a successful publication. Educational background and seniority are weighted lightly here relative to other tracks. Past cohorts have included strong fellows ranging from undergraduates to senior industry researchers.
Fellows are matched to mentors based on fit, and projects are scoped to produce concrete artifacts by program end: papers, evaluation suites, open-source tooling, or technical reports. Target audiences include safety and alignment teams at frontier labs, governments and other evaluation organizations, and the broader ML research community.
I prefer at least one research meeting per week, where we discuss results from the previous week and potential next steps, align on priorities, and stay motivated. Beyond that, I prefer relatively few meetings and much more asynchronous support, so I can think carefully about my responses and help throughout the process.
I have a decent amount of experience on the technical side, and in the past I have had good experiences quickly unblocking scholars who were stuck on technical obstacles (e.g., debugging low-level issues like memory problems, or taking a step back and thinking about alternative approaches). For example, I'm a huge fan of impromptu pair programming sessions to debug things together, and I always learn new things from dropping into someone's workflow. I'm also happy to help clarify things conceptually and just brainstorm together. The two biggest bottlenecks in my experience have been 1) getting stuck on technical obstacles and 2) conceptually understanding the problem we're trying to solve.
I'm open to a wider variety of skillsets, but these would be a big plus:
I would be happy to suggest concrete project ideas and help with brainstorming topic choices, or help guide an existing project that the scholar is interested in. My preference is that the scholar picks a category that overlaps with an area I actively work on so that I can give effective high-level advice.
Implementing SL4/5 and searching for differentially defense-favored security tools.
I love asynchronous collaboration and I'm happy to provide frequent small directional feedback, or do thorough reviews of your work with a bit more lead time. A typical week should look like either trying out a new angle on a problem, or making meaningful progress towards productionizing an existing approach.
Essential:
Preferred:
Mentor(s) will talk through project ideas with scholar, or scholar will pick from a list of projects.
This stream will pursue research on securing and hardening AI systems through rigorous testing, provable defenses, and formal specification, including improving benchmarks for agentic security, scaling mathematically-grounded robustness techniques like randomized smoothing and Lipschitz-constrained training, and developing formal methods for specifying safe agent behaviors.
Programming experience, some experience using AI-based systems, and mathematical maturity would be great for all the projects.
Beyond that, prior experience with building AI benchmarks, red teaming, formal methods, etc. would be great too.
We are excited to supervise projects that fall within the following two categories:
For 1., we are particularly interested in:
For 2., we are especially interested in:
Essential knowledge:
Essential experience:
Desired experience:
Bonus:
Lee's stream will focus primarily on improving mechanistic interpretability methods for reverse-engineering neural networks.
Mentorship looks like a 1-hour weekly meeting by default, with approximately daily Slack messages in between. Usually these meetings are just for updates about how the project is going, where I'll provide some input and steering if necessary and desired. If there are urgent bottlenecks, I'm more than happy to meet between the weekly meetings or respond on Slack in (almost always) less than 24 hours. We'll often run daily standup meetings if timezones permit, but these are optional.
As an indicative guide (this is not a score sheet), in no particular order, I evaluate candidates according to:
In the past cohort I chose a diversity of candidates with varying strengths, and I think this worked quite well. Some mentees were outstanding in particular dimensions; others were great all-rounders.
In general I'd like projects in my stream to at least be informed by SPD if not build on it directly. Scholars and I will discuss projects and come to a consensus on what feels like a good direction. I will not tell scholars to work on a particular direction, since in my experience intrinsic motivation to work on a particular direction is important for producing good research.
This stream will work on projects that empirically assess national security threats of AI misuse (CBRN terrorism and cyberattacks) and improve dangerous capability evaluations. Threat modeling applicants should have a skeptical mindset, enjoy case study work, and be strong written communicators. Eval applicants should be able and excited to help demonstrate concepts like sandbagging elicitation gaps in an AI misuse context.
Typically, this would include weekly meetings, detailed comments on drafts, and asynchronous messaging.
For threat modeling work: Skeptical mindset, transparent reasoning, analytical
For evaluations, mitigations, and verification work: LLM engineering skills (e.g., agent orchestration), biosecurity knowledge
Mentor(s) will talk through project ideas with scholar
Priority directions:
I usually spend at least 30 minutes per week in one-on-one meetings with my mentees. We can also discuss longer time slots if necessary. Besides these time slots, I try to be as responsive as possible over Slack (>2 comprehensive responses per day) and read relevant papers between weekly meetings.
I'm looking for the following skills:
I would prefer to set the overall direction, but I will listen closely to scholars about their preferences within a broad direction. Converging on a particular topic is expected to be a collaborative process.
We will continue working on black-box monitors for scheming in complex agentic settings, building on the success of the previous stream.
See here for details.
We have two weekly 60-minute calls by default. Since everyone will work on the same project, these calls will be with all participants of the stream. I respond to asynchronous messages on Slack daily. Scholars will have a lot of freedom for day-to-day decisions and direction setting. In the best case, you will understand the project better than I do after a few weeks and have a clear vision for where it should be heading. I recommend scholars focus 100% of their work time on the project and not pursue anything on the side. I think this way people will learn the most in MATS.
You will work on subprojects of black-box monitoring. See here for details.
The MATS Program is a 10-week research fellowship designed to train and support emerging researchers working on AI alignment, transparency and security. Fellows collaborate with world-class mentors, receive dedicated research management support, and join a vibrant community in Berkeley focused on advancing safe and reliable AI. The program provides the structure, resources, and mentorship needed to produce impactful research and launch long-term careers in AI safety.
MATS mentors are leading researchers from a broad range of AI safety, alignment, governance, field-building and security domains. They include academics, industry researchers, and independent experts who guide scholars through research projects, provide feedback, and help shape each scholar’s growth as a researcher. The mentors represent expertise in areas such as:
Key dates
Application:
The main program will then run from September 28th to December 4th, with the extension phase for accepted fellows beginning in December.
MATS accepts applicants from diverse academic and professional backgrounds - from machine learning, mathematics, and computer science to policy, economics, physics, cognitive science, biology, and public health, as well as founders, operators, and field-builders without traditional research backgrounds. The primary requirements are strong motivation to contribute to AI safety and evidence of technical aptitude, research potential, or relevant operational experience. Prior AI safety experience is helpful but not required.
Applicants submit a general application, applying to various tracks (Empirical, Theory, Strategy & Forecasting, Policy & Governance, Systems Security, Biosecurity, Founding & Field-Building).
In stage 2, applicants apply to streams within those tracks and complete track-specific evaluations.
After a centralized review period, applicants who advance will undergo additional evaluations, depending on the preferences of the streams they've applied to, before doing final interviews and receiving offers.
For more information on how to get into MATS, please look at this page.