The Summer 2026 program will run from June through August. It will be the largest MATS program to date, with 120 fellows and 100 mentors. Fellows will be connected with mentors or organizational research groups, such as Anthropic's Alignment Science team, UK AISI, Redwood Research, ARC, and LawZero, to collaborate on a research project over the summer. Some fellows will be offered a 6+ month extension to continue this collaboration.

Key dates for the application and admissions timeline
General Application (December 16th to January 18th)
Applicants fill out a general application, which should take 1-2 hours. Applications are due by January 18th.
Additional Evaluations (Late January through March)
Applicants who advance in the application process go through additional evaluations, including reference checks, coding tests, work tests, and interviews. The evaluations you undergo depend on the mentors and streams you apply to.
Admissions Decisions (Early April)
Selected applicants are notified of their acceptance and anticipated mentor later in the application cycle.
The main program takes place from early June to late August of 2026. It is an intensive research phase in which fellows work full-time on a research project in AI alignment, security, or governance. Fellows' research directions will typically be chosen through a collaborative process with their mentors, and fellows are expected to develop their own independent research direction as the program continues.
While mentor support will vary depending on the project and mentors, mentors are expected to spend at least 1 hour/week working with each of their scholars, and some spend much more time. Scholars will also receive support from MATS's Research Management team, who help scope and structure research directions.
Approximately one month into the program, scholars are expected to write a short Research Plan outlining their project's threat model, theory of change, and deliverables. At the end of the program, scholars will give a brief presentation at the Scholar Symposium on the research conducted over the course of MATS.
Educational seminars and workshops will be held 2-3 times per week. Previously, speakers have included Buck Shlegeris from Redwood Research, Adam Gleave from FAR AI, Neel Nanda from Google DeepMind, William Saunders from OpenAI, Andrew Critch from CHAI, Lennart Heim from GovAI, Ajeya Cotra from Open Philanthropy, and more.
The extension phase starts in September of 2026. Fellows who demonstrate promise as independent researchers during the main program can apply for the MATS extension phase. Acceptance into the extension is based on an independent technical program committee's evaluation of fellows' research plans, together with mentor endorsement.
The extension phase offers a default 6-month continuation, with exceptional scholars eligible for a 12-month Fellowship. Beginning four weeks after the end of the main program (with flexible start dates), extension fellows primarily work from Berkeley, California, the MATS London office, other AI safety hubs, or fully remotely.
MATS arranges funding for stipends, housing, and compute resources for accepted extension fellows, creating a seamless transition into this advanced phase of the program. Historically, around 70% of fellows have been accepted into the extension.
MATS aims to accelerate researchers who will:
MATS alumni have gone on to publish safety research, join alignment organizations, including Anthropic and MIRI, and found an alignment research lab. You can read more about MATS alumni here.
MATS supports researchers in a variety of research tracks, which include technical governance, empirical, policy & strategy, theory, and compute governance. MATS fellows participate in a research stream consisting of their mentor(s) and other mentees. You can specify which tracks and streams to apply to in the general application. Each stream provides its own research agenda, methodology, and mentorship focus. You can also view this list as a grid here.
Neel takes a pragmatic approach to interpretability: identify what stands between where we are now and where we want to be by AGI, and then focus on the subset of resulting research problems that can be tractably studied on today's models. This can look like diving deep into the internals of the model, or simpler black-box methods like reading and carefully intervening on the chain of thought - whatever is the right tool for the job. Projects could involve studying how to detect deception, understanding why a model took a seemingly concerning action, or fixing weak points in other areas of safety, e.g. using interpretability to stop models realising they are being tested. You can learn more about Neel's approach in this podcast.
He has spent far too much of his time mentoring MATS scholars, and has worked with ~60 so far - he's excited to take on even more!
We are interested in mentoring projects in AI forecasting and governance. This work would build on the AI 2027 report to either do more scenario forecasting or explore how to positively affect key decision points, informed by our scenario.
We will have meetings each week to check in and discuss next steps. We will be consistently available on Slack in between meetings to discuss your research, project TODOs, etc.
The most important characteristics include:
Characteristics that are also important, but not required, include:
We will talk through project ideas with scholar
Agent Foundations research focused on clarifying conditions under which humans can justifiably trust artificial intelligence systems.
We can discuss this more and decide on a different structure, but by default, 1 hour 1-on-1 meetings with each scholar once a week, plus a 2 hour group meeting which may also include outside collaborators.
Essential:
Preferred:
Quality of fit is roughly proportional to philosophical skill times mathematical skill. Someone with excellent philosophical depth and almost no mathematics could be an OK fit, but would probably struggle to produce or evaluate proofs. Someone with excellent mathematical depth but no philosophy could be an OK fit, but might struggle to understand what assumptions and theorems are useful/interesting.
There will be some flexibility about what specific projects scholars will pursue. Abram will discuss the current state of his research with scholars and what topics scholars are interested in, aiming to settle on a topic by or before week 2.
Alignment is solved for models in the current paradigm. This shifts the threat model to good old human conflict, so I'm excited about coordination tech (AI cooperation, datacenter workload verification). For aligning future models, we have to forecast what future AGIs will look like and solve issues before they come up. I’m excited about models that maintain their goodness under self-directed learning and can align their successor.
Every week we have a meeting, where you are expected to bring up questions, problems that are preventing your progress, or things you would like advice on. We set goals for the next week. In the week between meetings, you work towards the agreed-upon goals. I am available to unblock via Slack or short meetings if necessary.
The two main qualities I look for in a scholar are:
Other important things:
Other nice things:
Not a good fit if:
We'll talk about possible projects together. By the end of week 1 we should have something that we're both excited about, and we'll finalize the decision by the middle of week 2.
I'm taking two scholars, and I'm hoping the three of us can agree on a project together. I think a tiny research team can keep each other motivated and accomplish much more than two separate scholar-Adrià teams.
This stream focuses on empirical AI control research, including defending against AI-driven data poisoning, evaluating and attacking chain-of-thought monitorability, and related monitoring/red-teaming projects. It is well-suited to applicants already interested in AI safety with solid Python skills, and ideally prior research or familiarity with control literature/tools (e.g. Inspect/ControlArena).
1-hour weekly meetings to go through your research log and provide high-level guidance. Daily updates on Slack are also very useful, and I typically reply to any questions within 2 days.
Essential:
You may be a good fit if you also have some of:
Not a good fit:
By default I'll propose several projects for you to choose from, but you can also pitch ideas that you're interested in.
Building realistic defensive cybersecurity benchmarks. Asymmetric Security responds to real cyber incidents and therefore holds data not available in the public domain. We would like to work with MATS scholars to build realistic benchmarks grounded in these real cyber incidents.
1 hour weekly meetings by default for high-level guidance. We will respond within a day to async communication.
Essential:
Preferred:
We will assign the project direction; scholars will have significant tactical freedom.
The Alignment Research Center is a small non-profit research group based in Berkeley, California, that is working on a systematic and theoretically grounded approach to mechanistically explaining neural network behavior. We are interested in scholars with a strong math background and mathematical maturity. If you'd be excited to work on the research direction described in this blog post, we'd encourage you to apply!
Scholars will work out of ARC's offices in Berkeley (though we might take a London-based scholar as well). Each scholar will meet with their mentor at least once a week for an hour, though 2-3 hours per week is not uncommon. Besides time with their official mentor, scholars will likely spend time working in collaboration with other researchers; a typical scholar will likely spend about 25% of their time actively collaborating or learning about others' research.
Essential:
Preferred:
Each scholar will be paired with the mentor that best suits their skills and interests. The mentor will discuss potential projects with the scholar, and they will decide what project makes the most sense, based on ARC's research goals and the scholar's preferences.
Most scholars will work on multiple projects over the course of their time at ARC, and some scholars will work with multiple mentors.
This coalition of mentors makes up the “megastream”. This stream spans a range of empirical research areas in AI safety on LLMs, including AI control, scalable oversight, model organisms, model internals, model welfare, security, and more. You'll be pitched, and have the option to pitch, a variety of safety research projects, and then be matched to projects and mentors based on your research interests and what you'd like to get out of MATS. Scholars in this stream frequently receive funding and continued mentorship after MATS to complete their research project, usually leading to a (co-)first-author paper. People in this stream often end up in long-term homes for safety research after MATS (e.g. Anthropic, Redwood Research, OpenAI).
Megastream mentors share an application, tend to collaborate and co-mentor projects together, and generally share infrastructure to streamline the scholar experience. By applying to this stream, you are being considered for all of the megastream mentors. In the application process, you can indicate particular mentors you are interested in working with.
During the program, scholars meet weekly with their project mentors and collaborators. Some projects meet more often without mentors (e.g., daily standups with the peers on the project). Each project will have a primary mentor, who is also the main decision-maker on key milestones for the project and who is the default person to go to for feedback, advice, etc. Co-mentors also attend project meetings as needed and provide feedback throughout the program. Some project co-mentors can be as involved as the primary mentor.
Mentorship starts with the “Project Pitch Session” Anthropic runs at the start of the program. During this session, dozens of researchers from Anthropic, Redwood, OpenAI, and other AI Safety orgs pitch projects they’d be excited to work on. Scholars get ~1 week to derisk and trial projects before submitting their preferences. Starting on week 2, scholars are assigned projects where the primary mentor is whoever pitched it (e.g. Ethan, Buck S, Evan, etc.). Some projects are assigned co-mentors who are other supervisors who want to join the project.
Arthur Conmy's MATS Stream focuses on evaluating interpretability techniques on current and future AI Safety problems.
This can involve creating new safety techniques, as well as creating benchmarks and measuring performance against baseline techniques.
I meet for 1 hour/week, in scheduled group meetings.
I also fairly frequently schedule ad hoc meetings with scholars to check on how they're doing and to address issues or opportunities that aren't directly related to the project.
I'll help with research obstacles, including outside of meetings.
Executing fast on projects is highly important. Having a good sense of which next steps are correct is also valuable, though since I enjoy being pretty involved in projects, it's somewhat easier for me to steer projects than to teach you how to execute fast from scratch. It also helps to be motivated to make interpretability useful and to use it for AI Safety.
I will also be interviewing folks from Neel Nanda's MATS research sprint whom Neel doesn't get to work with.
Mentor(s) will talk through project ideas with scholar.
In the face of disaster, I suspect the government will be forced to play insurer of last resort, whether for a particular lab or society at large. (I'm not the only one to suspect this – see e.g. here.) Designed well, I believe a federal insurance backstop could internalize catastrophic negative externalities; designed poorly, it will simply be a subsidy for AI companies. I want to design the good version, so we have it ready.
I encourage people with mechanism design (a.k.a. reverse game theory) expertise to apply, but don't be deterred if you don't have this expertise.
1 hour weekly meetings by default for high-level guidance. I'm active on Slack and typically respond within a day for quick questions or conceptual (not code) debugging. Between meetings, expect async back-and-forth on paper structure, or experiment design and results. Scholars can also schedule ad-hoc calls if they're stuck or want to brainstorm—just ping me on Slack.
Depending on the project, I may help with writing.
If interested in the technical paper, applicants must:
For all applicants:
Preferred:
Nice to haves:
Not a good fit:
For technical versions of this project, I suspect the project will automatically be fairly tightly scoped based on the scholar's expertise. I will pose the core challenge, and over the first week the scholar and I will hammer out exactly which theoretical questions need answering and which empirical surveys need running.
For non-technical versions of this project, I will pitch a few different projects and scholars will try ones they find interesting for a week. In week 2 we'll settle on one together.
This stream focuses on representations that underlie how language models generalize, for example representations of personas, goals, or training data components.
1 hour/week meetings + async discussions in Slack threads; can schedule additional meetings ad hoc as needed.
Essential:
Preferred:
We'll go through potential projects at the beginning, and scholars can propose alternatives. Scholars should explore for the first week or two, and decide on a project direction in the second week.
We study applications of singular learning theory (SLT) to AI safety, with a focus on interpretability and alignment. Ideal candidates come from a strong technical background in mathematics, physics, computer science, or biology, and aren't afraid to get their hands dirty with ML experiments. We don't expect you to have deep expertise in SLT, but a shallow familiarity will help.
The team will meet weekly together with both mentors. Separately, you will meet 1-on-1 with at least one of the mentors every other week. We conduct our asynchronous communications through an internal Discord server. We expect you to schedule additional pair-programming/debugging calls with other people on the team as needed.
We'll help with research obstacles, including outside of meetings.
If you're interested in working on more of the empirical side, you should have prior experience with ML engineering (at least at the level of a program like ARENA) and prior research experience (potentially in a field outside of ML). A bonus would be prior familiarity with designing and running ML experiments or research specifically in AI safety.
If you're interested in working on more of the theoretical side, you should have prior research experience in a relevant field like mathematics, theoretical physics, or theoretical computer science.
Please make sure that your background and interests are clearly described in your application. By default, we'll be looking for evidence of research ability in the form of publications.
We do not expect you to already be aware of SLT, but if you pass the first round, please prepare by conducting some background reading (see: timaeus.co/learn).
Mentor(s) will talk through project ideas with scholar and suggest several options to choose from.
I have two broad areas.
Security:
I am interested in building demonstrations for hacking real-world AI deployments to show that they are not secure. The goal is to force companies to invest in alignment techniques that can solve the underlying security issues.
Benchmarks:
I am interested in building benchmarks to determine how generalizable modern LLM techniques actually are, now that we are no longer in the pre-training scaling era.
I will meet 1-1 or as a group, depending on how interests relate to the projects. We'll communicate on Slack outside of meetings.
I strongly prefer multiple short meetings over single long meetings, except at the start.
I'll help with research obstacles, including outside of meetings
For security:
You should have a strong security mindset and have demonstrated a willingness to be creative in this area. I would like to see past evidence of a willingness to get your hands dirty and try many different systems.
For benchmarks:
As creative as possible, a willingness to work on the nitty-gritty, and a willingness to work really hard on problems other people find boring. And interests as far away from SF-related interests as possible.
Mentor(s) will talk through project ideas with scholar
This stream will focus on monitoring, stress-testing safety methods, and evals, with a focus on risks from scheming AIs. Examples include (black-box) AI control techniques, white-box monitors (probes etc.), chain-of-thought monitoring/faithfulness, building evaluation environments, and stress-testing mitigations.
For each project, we will have a weekly meeting to discuss the overall project direction and prioritize next steps for the upcoming week. On a day-to-day basis, you will discuss experiments and write code with other mentees on the project (though I'm available on Slack for quick feedback between meetings or to address things that are blocking you).
I structure the program around collaborative, team-based research projects. You will work in a small team on a project from a predefined list. I organize the 12-week program into fast-paced research sprints designed to build and maintain research velocity, so you should expect regular deadlines and milestones. I will provide a more detailed schedule and set of milestones at the beginning of the program.
I am looking for scholars with strong machine learning engineering skills, as well as a background in technical research. While I'll provide weekly guidance on research, I expect scholars to be able to run experiments and decide on low-level details fairly independently most of the time. I'll propose concrete projects to choose from, so you should not expect to work on your own research idea during MATS. I strongly encourage collaboration within the stream, so you should expect to work in teams of 2-3 scholars on a project; good communication and teamwork skills are therefore important.
We will most likely have a joint project selection phase, where we present a list of projects (with the option for scholars to iterate on them). Afterward, each project will have at least one main mentor, but we might also co-mentor some projects.
This stream will work on gathering and analyzing data in order to shed light on the driving forces behind AI and monitor its impacts.
Scholars will have individual weekly meetings for half an hour with their mentor, as well as a group meeting with their mentor for half an hour. Additionally, scholars will attend Epoch’s weekly Work In Progress meeting.
Some useful characteristics (don't need all of these):
Scholar will pick from a list of projects
AI macrostrategy: strategic questions about how the transition to advanced AI will happen, and what we can do now to prepare for it.
Topics of interest include better futures, power concentration, takeoff speeds, deals with AIs, space governance, and acausal trade.
Each scholar will be assigned a primary mentor who will meet with them once a week. The specifics will depend on the candidate and project.
We’re looking for people who:
It’s a bonus if you already have research experience, or have domain knowledge in a relevant field like philosophy or economics.
For project ideas, see here
In this project, we will explore GPU side-channel attacks to extract information about model usage. A simple example is to observe (via radio, power fluctuations, acoustics, etc.) which experts were used in each forward pass of an MoE model, then use those observations to guess which tokens were produced.
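To make the inference step concrete, here is a minimal toy sketch. It is not the stream's actual method, and all names, sizes, and weights are invented: it assumes the expert set chosen by a router is a deterministic function of a token's embedding and that the attacker observes that set perfectly, then shows how a single observation narrows the candidate tokens. Real MoE routing depends on contextual hidden states, so a real attack would be much noisier.

```python
# Toy sketch of the inference step in an expert-usage side channel.
# Assumptions (not from the stream description): routing acts directly on a
# token embedding, the attacker observes the chosen expert set exactly, and
# all sizes/weights below are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, D_MODEL, N_EXPERTS, TOP_K = 1000, 64, 8, 2

token_embeddings = rng.normal(size=(VOCAB, D_MODEL))
router_weights = rng.normal(size=(D_MODEL, N_EXPERTS))

def routed_experts(token_id: int) -> frozenset:
    """Top-k experts the toy router selects for this token's embedding."""
    logits = token_embeddings[token_id] @ router_weights
    return frozenset(int(e) for e in np.argsort(logits)[-TOP_K:])

# Offline phase: map each observable expert set to the tokens that produce it.
expert_set_to_tokens: dict[frozenset, list[int]] = {}
for tok in range(VOCAB):
    expert_set_to_tokens.setdefault(routed_experts(tok), []).append(tok)

# Online phase: one side-channel observation shrinks the candidate token set.
secret_token = 42
observed_experts = routed_experts(secret_token)   # what the side channel leaks
candidates = expert_set_to_tokens[observed_experts]
print(f"Observed experts {sorted(observed_experts)}: "
      f"{len(candidates)} of {VOCAB} tokens remain possible")
```

With 8 experts and top-2 routing there are only 28 possible expert sets, so even one noiseless observation in this toy setup cuts a 1000-token vocabulary down by roughly an order of magnitude; combining observations across layers and positions is where a real project would begin.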
Co-working 2-4 hours per week, including detailed guidance. Flexible. 1-hour check-ins per week. You can schedule ad-hoc calls if you're stuck or want to brainstorm.
Please note: experience with hardware is not a requirement for this stream, as long as you are willing to work hard and learn fast, and can show other evidence of exceptional ability. If in doubt: we encourage you to apply!
We will provide you with a lot of autonomy and plug-and-play access to a rare combination of tools and equipment; in exchange, we expect you to have strong self-direction, intellectual ambition, and a lot of curiosity. This stream requires you to have a tight experiment loop to form and test hypotheses on the fly.
Example skill profiles:
Must have: Trained or fine-tuned a transformer language model in PyTorch (toy models and following guides are fine). Familiar with basic electronics concepts (voltage, current, transistors). Has experience writing research papers, even as a class assignment.
Nice to have: Familiarity with LaTeX, PyTorch internals, CUDA/OpenCL, GPU architecture, chip design, oscilloscopes, signal processing, electrical engineering.
There is a cluster of potential projects to choose from. As a team, we will decide which to pursue based on individual interest and skills. Mentors will pitch example projects and scholars can then modify and re-pitch them. Once the research problem, hypothesis, and testing plan are written and agreed on, scholars begin object-level work. We encourage failing fast and jumping to a fallback project.
I'm interested in mentoring projects related to reward hacking and monitoring (agentic) models that produce long and complex trajectories. Scholars will have freedom to propose projects within this scope. Expect 30-60 min of 1-1 time on Zoom.
30-minute to 1-hour weekly meetings (on Zoom) by default for high-level guidance. I'm active on Slack and typically respond within a day for quick questions or conceptual (not code) debugging. Expect async back-and-forth on experiment design and results between meetings. Scholars can also schedule ad-hoc calls if they're stuck or want to brainstorm; just ping me on Slack.
Weeks 1-2: The mentor will provide high-level directions or problems to work on, and the scholar will have the freedom to propose specific projects and discuss them with the mentor.
Week 3: Figure out a detailed plan for the project.
I prefer at least one research meeting per week, where we discuss results from the previous week and potential next steps, and generally align on priorities and stay motivated. I'm also a fan of relatively few meetings, with much more support given asynchronously, so I can think carefully about my responses and help throughout the process.
I have a decent amount of experience on the technical side, and so in the past have had good experiences unblocking scholars when they were stuck on technical obstacles right away (e.g. low-level bugs like memory issues, taking a step back and thinking about alternative approaches, etc). For example, I'm a huge fan of impromptu pair programming sessions to debug things together, and I always learn new things from dropping into someone's workflow. I'm also happy to help clarify things conceptually and just brainstorm together. The two biggest bottlenecks in my experience have been 1) getting stuck on technical obstacles and 2) conceptually understanding the problem we're trying to solve.
I'm open to a wider variety of skillsets, but these would be a big plus:
I would be happy to suggest concrete project ideas and help with brainstorming topic choices, or help guide an existing project that the scholar is interested in. My preference is that the scholar picks a category that overlaps with an area I actively work on so that I can give effective high-level advice.
Janet Egan will mentor scholars working on policy-relevant questions at the intersection of AI compute, geopolitics, and infrastructure. Potential projects include analyzing remote access to AI chips (e.g., via cloud providers in China), mapping and interpreting the global buildout of AI data centers and energy infrastructure, and developing politically informed strategies for US–China cooperation on AI risk. The mentee will lead their research project with weekly guidance, feedback, and optional career and policy insights.
After discussing and agreeing on a topic, the mentee will play a leading role in driving the research forward and will receive weekly check-ins, advice, and written feedback. Optional support includes introductions to others in the field, insights into policymaking, and career advice.
Proactive, motivated individuals with experience digging deep into technical issues. Excellent attention to detail and a curious mindset. Strong communication skills and an interest in conveying technical concepts to policy and generalist audiences. An interest in data centers, geopolitics, and/or energy infrastructure is welcome.
Mentor will talk through project ideas with scholar
Implementing SL4/5 and searching for differentially defense-favored security tools.
I love asynchronous collaboration and I'm happy to provide frequent small directional feedback, or do thorough reviews of your work with a bit more lead time. A typical week should look like either trying out a new angle on a problem, or making meaningful progress towards productionizing an existing approach.
Essential:
Preferred:
Mentor(s) will talk through project ideas with scholar, or scholar will pick from a list of projects.
This stream will pursue research on securing and hardening AI systems through rigorous testing, provable defenses, and formal specification, including improving benchmarks for agentic security, scaling mathematically-grounded robustness techniques like randomized smoothing and Lipschitz-constrained training, and developing formal methods for specifying safe agent behaviors.
Programming experience, some experience using AI-based systems, and mathematical maturity would be great for all the projects.
Beyond that, prior experience with building AI benchmarks, red teaming, formal methods, etc. would be great too.
We are excited to supervise projects that fall within the two following categories:
For 1., we are particularly interested in:
For 2., we are especially interested in:
Essential knowledge:
Essential experience:
Desired experience:
Bonus:
Lee's stream will focus primarily on improving mechanistic interpretability methods for reverse-engineering neural networks.
Mentorship looks like a 1-hour weekly meeting by default, with approximately daily Slack messages in between. Usually these meetings are just for updates about how the project is going, where I'll provide some input and steering if necessary and desired. If there are urgent bottlenecks, I'm more than happy to meet between the weekly intervals or respond on Slack in (almost always) less than 24 hours. We'll often run daily standup meetings if timezones permit, but these are optional.
As an indicative guide (this is not a score sheet), in no particular order, I evaluate candidates according to:
In the past cohort I chose a diversity of candidates with varying strengths, and I think this worked quite well. Some mentees were outstanding in particular dimensions; others were great all-rounders.
In general I'd like projects in my stream to at least be informed by SPD if not build on it directly. Scholars and I will discuss projects and come to a consensus on what feels like a good direction. I will not tell scholars to work on a particular direction, since in my experience intrinsic motivation to work on a particular direction is important for producing good research.
This stream will work on projects that empirically assess national security threats of AI misuse (CBRN terrorism and cyberattacks) and improve dangerous capability evaluations. Threat modeling applicants should have a skeptical mindset, enjoy case study work, and be strong written communicators. Eval applicants should be able and excited to help demonstrate concepts like sandbagging elicitation gaps in an AI misuse context.
Typically, this would include weekly meetings, detailed comments on drafts, and asynchronous messaging.
For threat modeling work: Skeptical mindset, transparent reasoning, analytical
For evaluations, mitigations, and verification work: LLM engineering skills (e.g., agent orchestration), biosecurity knowledge
Mentor(s) will talk through project ideas with scholar
Priority directions:
I usually spend at least 30 min per week in one-on-one meetings with my mentees. We can also discuss longer time slots if necessary. Outside of these time slots, I try to be as responsive as possible over Slack (>2 comprehensive responses per day) and read relevant papers between weekly meetings.
I'm looking for the following skills:
I would prefer to set the overall direction, but I will listen closely to scholars about their preferences within a broad direction. Converging on a particular topic is expected to be a collaborative process.
We will continue working on black-box monitors for scheming in complex agentic settings, building on the success of the previous stream.
See here for details.
We have two weekly 60-minute calls by default. Since everyone will work on the same project, these calls will be with all participants of the stream. I respond to asynchronous messages on Slack on a daily basis. Scholars will have a lot of freedom for day-to-day decisions and direction setting. In the best case, you will understand the project better than me after a few weeks and have a clear vision for where it should be heading. I recommend scholars focus 100% of their work time on the project and not pursue anything on the side. I think this way people will learn the most in MATS.
You will work on subprojects of black box monitoring. See here for details.
An AI control-focussed stream, probably running in person in London.
I'm pretty hands-off. I expect scholars to fully take charge of the project, and update / consult me as needed. I do want my scholars to succeed, and am happy to advise on project direction, experiment design, interpreting results, decision-making / breaking ties, or getting unstuck.
During the program, we'll meet once a week to go through any updates / results, and your plans for the next week. I'm also happy to comment on docs, respond on Slack, or have additional ad hoc meetings when useful.
I'll propose ~5 projects for scholars to red-team, flesh out and decide on one to own. I'm also open to scholar-proposed projects if they sound promising; I'd just be less useful as an advisor.
Escalation risks from state perceptions of AI capability, AI-enabled targeting, AI-enabled decision manipulation, and the impact of AI integration into nuclear command and control.
Mentorship will mostly consist of calls, sorting through research ideas and providing feedback. I'll be up for reviewing papers, and potentially meeting in person depending on timing.
Looking for intellectually curious and honest scholars, with some background on topics related to national security, game theory, or AI-enabled military and influence capabilities.
I'll talk through project ideas with scholar, or the scholar can pick from a list of projects
This stream focuses on AI policy, especially technical governance topics. Tentative project options include: technical projects for verifying AI treaties, metascience for AI safety and governance, and proposals for tracking AI-caused job loss. Scholars can also propose their own projects.
We'll meet once or twice a week (~1 hr/wk total, as a team if it's a team project). I'm based in DC, so we'll meet remotely. I (Mauricio) will also be available for async discussion, career advising, and detailed feedback on research plans and drafts.
No hard requirements. Bonus points for research experience, AI safety and governance knowledge, writing and analytical reasoning skills, and experience relevant to specific projects.
I'll talk through project ideas with scholar
This stream will focus on the science and development of model evaluations, especially monitorability and alignment evals.
I'll meet with scholars 2x/week each. I'll also be generally available async and potentially for code review.
Various profiles could be a good fit.
Wanted:
Some of the following would be great but not essential:
I'll provide a list of possible projects to pick from, and talk through the options before making a decision.
Scholars can also suggest their own projects.
Research papers (technical governance or ML) related to evaluating and mitigating dangerous AI capabilities, with a focus on what's actionable and relevant for AGI companies
I like to get daily standup messages about the progress that has been made on the project, and I'm happy to provide some quick async feedback on new outputs. I'll also have weekly meetings. I'm based at Constellation in Berkeley.
Good writers/researchers who can work independently and autonomously! I'm looking for scholars who can ship a meaningful research output end-to-end and ideally have prior experience in writing relevant papers.
I may assign a project, have you pick from a list of projects, or talk through project ideas with you.
This stream will focus on projects to better understand models' dangerous capabilities, especially those related to security.
It will also work on finding better ways to evaluate the safety and robustness of models.
I'm interested in empirical projects that improve our ability to evaluate model capabilities or that help us understand or evaluate model monitorability. An ideal project culminates in a research output (a conference/arXiv paper or a research blog post with artifacts).
Time commitments: I do not expect to be able to spend more than 5 hours in any week.
Meetings: I expect to have weekly project meetings of about an hour, where we chat about your results from the last week, the planned next steps, and any blockers or uncertainties. We'll have a monthly overall project check-in about broader progress towards overall goals.
Help outside of meetings: I am available to provide some help most weeks outside of the meeting, but by and large I expect mentees to be self-directed and self-sufficient in solving problems.
An ideal mentee has a strong AI research background (software engineering is a plus). It's important that they are self-motivated and can make weekly progress with little intervention. If you are interested in working on projects that are not concretely scoped, I would expect you to be able to write well-scoped project proposals, with realistic planned milestones and deliverables. Evidence of successful projects here would be very helpful in evaluating this.
A mentee can be a PhD student and they can work on a paper that will be part of their thesis.
I will talk through project ideas with the scholar
Making society safe from AI doesn't just mean making safe AI: we're figuring out how to uplift human collective intelligence, manage a highly multiagent world, and improve foresight and institutional competence, ideally learning how to make the best positive use of frontier AI systems as we go. FLF has a small, sharp team of researchers with a wide network, and we're looking to nurture new and missing approaches to minimising large-scale risks while steering toward a flourishing future.
I'm willing to devote a few hours per week to this: I'll keep a 30-minute or 1-hour slot available weekly, and interact on Slack roughly daily. Some closer projects might be much more interactive.
Depends a lot on direction. Ideally, be able to make proposals and dig into things somewhat independently. Be good at explaining your thinking, and able and willing to teach me things!
For collective intelligence/human reasoning, I'd usually want someone very familiar with software production, at least skilled in software development or in product management and prototyping. Other candidates with great vision can succeed here if they're able to work with complementary talent to get things going.
For foresight, any of: polymathic/multi-STEM/futurism background, deep expertise in bio and/or AI, natsec experience or connections, unusual writer/game dev talent, safety engineering background, other background that you think I might want to hear about.
For multiagent accountability: law, economics, politics, history, or a combination, plus some familiarity with AI and agents.
I'll ask for interests and (if you have them) a proposal or two right away. We'll spend the first week or two iterating that, discussing other options, and maybe trying out little experiments. Likely we'll pick a direction then, but it's also fine if we pivot later.
Projects in this stream will be on AI welfare and moral status; more specifically, on what it takes to be a moral patient and how we can determine whether AI systems meet the conditions. I'm looking for applicants who have ideas about these topics and are motivated to explore them in more detail.
By default, scholars will meet with me online for 1hr/week and I will respond to questions on email/slack.
Scholars should have the following characteristics:
I will talk through project ideas with scholar
In this stream we will explore extensions and implications of our discovery that neural networks pretrained on next-token prediction represent belief-state geometry in their activations. We will build on this fundamental theory of neural network representations in order to discover what AI systems are thinking, and understand their emergent behaviors.
Early in the program, Paul and Adam will meet in person with scholars to help them get up to speed on the theoretical and technical background needed to understand and contribute to our framework. Subsequent weekly meetings with mentees aim to answer questions, unblock research, explore project ideas, and give feedback and suggestions on research.
The project can leverage applicants' strengths in mathematical modeling and/or ML engineering. We welcome highly driven and relatively autonomous researchers who would like to benefit from our mentorship while taking the lead on a relevant project of their choice. The ideal scholar has the ability to move fast, and has experience in either research (e.g., a PhD in any field) or software/ML engineering.
We will talk through project ideas with scholar
Peter Henderson’s stream focuses on developing safe, aligned AI agents, with projects on scalable oversight rules informed by law and game theory, safe long-horizon exploration, and measuring “jagged” capability/safety frontiers. Scholars will join an independently driven, engineering-heavy research environment, collaborating with other MATS scholars and PhD students, with weekly 1:1s and active async mentorship.
45 min weekly meetings by default for high-level guidance. I'm active on Slack for quick questions or conceptual (not code) debugging. Expect async back-and-forth on experiment design and results between meetings. Scholars can also schedule ad-hoc calls if they're stuck or want to brainstorm—just ping me on Slack. Other team members (PhD students) will also be around to help brainstorm, getting unstuck.
Essential:
Nice to have, but not necessary:
Not a good fit:
Mentors in the group will pitch projects, and scholars will try ones they find interesting for a week. We'll iterate together at the end of week 1 and pick final assignments in week 2.
The Redwood Research stream is looking for fast empirical iterators and strategists to work on control research.
Depending on the mentor:
We are looking for people who are:
We will assign projects by default but are open to getting pitched on projects.
My MATS fellows will do philosophical thinking about multi-agent intelligence and how agents change their values. This will likely involve trying to explore and synthesize ideas from game theory, signaling theory, reinforcement learning, and other related domains.
I'll come meet scholars in person around 2 days a week on average. On those days I'll be broadly available for discussions and brainstorming. On other days scholars can message me for guidance (though I'd prefer to spend most of my effort on this during the in-person days).
My main criterion for selecting scholars will be clarity of reasoning.
I will talk through project ideas with the scholar.
Roger Grosse’s stream investigates how to improve influence functions and other training data attribution methods, and uses these tools to study alignment-related phenomena such as out-of-context reasoning and emergent misalignment. The ideal scholar has experience with LLM internals, strong statistics/applied math skills (especially numerical linear algebra), and can independently drive research from literature review through experimentation and analysis. Roger provides shovel-ready projects while giving exceptional scholars freedom to pursue their own ideas, and is open to scholars collaborating with others.
I will meet with scholars 1 hour per week by default, and will be available to answer questions on Slack roughly daily.
I will give the scholar the level of freedom they are ready for. I will be prepared with focused, shovel-ready projects, but exceptional scholars with a vision they are excited about will have the flexibility to pursue it.
International coordination to reduce frontier AI risks, with a focus on China and the West.
1 hour weekly meetings by default for high-level guidance. We are active on Slack and typically respond within a day for quick questions.
Good understanding of international AI governance developments that are relevant to frontier AI safety (e.g., the Summit series, AISI network)
Good understanding of Chinese AI governance and safety (key players, key trends and institutional structures)
Good understanding of key frontier risk domains (CBRN, cyber, loss of control)
Some understanding of broader US-China relations and how they frame US-China engagement on AI/AGI specifically
In Week 1, we will provide a shortlist of projects that we are keen for the scholar to work on. We'll ask scholars to scope these in the first week and make a determination about which project to focus on in Week 2.
We build scalable technology for AI understanding and oversight.
You will work closely with a mentor through recurring meetings (group and individual) and Slack.
We're looking for strong, experienced software engineers or talented researchers who can hit the ground running and iterate quickly.
ML experience is a bonus but not required.
We will talk through project ideas with scholar
The stream will advance empirical methodologies for third-party AI safety evaluations. Example research topics include chain-of-thought monitorability, the secret loyalties research agenda, and automatic auditing (e.g., with Anthropic's Parallel Exploration Tool for Risky Interactions).
I will have scholars work in teams. During the week, the scholars will collaborate with each other and are encouraged to meet frequently. I will hold a weekly advising meeting for each project to provide help and guidance.
My projects will prioritize impact on the field of AI safety over academic novelty. Beyond the skills needed for empirical AI safety research, I am looking for collaborators who are excited about doing sound and impactful science, including the mundane aspects of doing good science.
I will provide a list of projects, but can also talk through other project ideas
The stream will focus on conceptual, empirical, and theoretical work on scalable oversight and control. This includes but is not limited to creating model organisms for specific failure modes, designing training procedures against them, and making progress on subproblems involved in safety cases.
A research agenda document will be shared ahead of time with a short list of project ideas. The scholars can also brainstorm and pitch ideas that are aligned with the research agenda. We will decide on assignments in week 2.
I (Cas) work on a range of projects from technical safeguards to technical governance. This stream follows an academic collaboration model, and the work will likely focus on technical topics in AI governance.
2-3 meetings per week plus regular messaging and collaborative writing.
Green flags include:
Mentor(s) will talk through project ideas with scholar.
In the shard theory stream, we create qualitatively new methods and fields of inquiry, from steering vectors to gradient routing to unsupervised capability elicitation to robust unlearning. If you're theory-minded, maybe you'll help us formalize shard theory itself.
We will have weekly 1-1s and a weekly team lunch, as well as asynchronous communication over Slack. Mentees are always welcome to reach out at any time if guidance is needed outside of usual meeting times.
Scholars should mostly figure things out on their own outside of meetings
Ideal candidates would have:
Mentor(s) will talk through project ideas with scholar
I'm mostly interested in AI control and scalable oversight. I'm excited to work with scholars interested in empirical projects building and evaluating control measures and oversight techniques for LLM agents, especially those based on chain of thought monitoring. I'm also interested in the science of chain of thought monitorability, misalignment, and control. An ideal project ends with a paper submitted to NeurIPS/ICML/ICLR.
I'll meet with mentees once a week and will be available on Slack daily.
An ideal mentee has a strong AI research and/or software engineering background. A mentee can be a PhD student and they can work on a paper that will be part of their thesis.
I'll talk through project ideas with scholar
This stream is for the UK AISI Red Team. The team focuses on stress-testing mitigations for AI risk, including misuse safeguards, control techniques, and model alignment red-teaming. We plan to work on projects building and improving methods for performing these kinds of evaluations.
Each scholar will have one primary mentor from the Red Team who will provide weekly guidance and day-to-day support
Scholars will also have access to secondary advisors within their specific sub-team (misuse, alignment, or control) for technical deep-dives
Team lead Xander Davies and advisors Geoffrey Irving and Yarin Gal will provide periodic feedback through team meetings and project reviews
For scholars working on cross-cutting projects, we can arrange mentorship from multiple sub-teams as needed
Structure:
Weekly 1:1 meetings (60 minutes) with primary mentor for project updates, technical guidance, and problem-solving
Asynchronous communication via Slack/email throughout the week for quick questions and feedback
Bi-weekly team meetings where scholars can present work-in-progress and get broader team input
Working style:
We expect scholars to work semi-independently – taking initiative on their research direction while leveraging mentors for guidance on technical challenges, research strategy, and navigating AISI resources
Scholars will have access to our compute resources, pre-release frontier models, and operational support to focus on research
We encourage scholars to document their work and, if appropriate, aim for publication or public blog posts
We're looking for scholars with hands-on experience in machine learning and AI security, particularly those interested in adversarial robustness, red teaming, or AI safeguards. Ideal candidates would have:
We welcome scholars at various career stages, especially those who are eager to work on problems with direct impact on how frontier AI is governed and deployed.
Scholars will choose from a set of predefined project directions aligned with our current research priorities, such as:
We'll provide initial direction and guidance on project scoping, then scholars will have autonomy to explore specific approaches within that framework.
Expect weekly touchpoints to ensure progress and refine directions.
If mentees have particular ideas they're excited about that they see as fitting within the scope of the team's work, they're welcome to propose them, but there is no guarantee these will be selected.
Conceptual research on deceptive alignment, designing scheming propensity evaluations and honeypots. The stream will run in person in London, with scholars working together in team(s).
During the program, we will meet once a week to go through any updates / results, and your plans for the next week. I'm also happy to comment on docs, respond on Slack, or have additional ad hoc meetings as needed.
I will talk through project ideas with scholars
The MATS Research phase provides scholars with a community of peers.

Scholars work out of a shared office and are supported by the Community Team.
MATS alumni report that the connections with peers that they made during MATS have had the largest impact on them years later. Our full-time Community Team works to facilitate these connections and also provides general well-being support. Weekly lightning talks, scholar-led discussion groups, game nights, and outings to SF are some examples of MATS events.