Streams in this track involve hands-on research that uses machine learning experiments to understand and improve model safety, including AI control, interpretability, scalable oversight, evaluations, red-teaming, and robustness. This is the largest track in the program and is defined by its methods rather than any single research agenda. If your primary tool is ML engineering, this is your track.
The track is defined by its methodology more than by any single research agenda. Fellows run ML experiments to understand and improve the safety properties of frontier models, with work spanning interpretability, AI control, scalable oversight, evaluations, red-teaming, robustness, and model organisms of misalignment. The unifying thread is that progress comes from getting hands on real models (training, probing, fine-tuning, measuring) rather than reasoning from first principles alone. This is the largest track in the program and the most common entry point into technical AI safety research.
We are looking for fellows whose primary tool is ML engineering, broadly construed. The essential requirement is the ability to design and run experiments on language models or other deep learning systems and iterate quickly on the results. In practice that usually means strong Python (with and without AI coding tools), comfort with the infrastructure around running models at moderate scale, and enough research taste to know which experiments are worth running. Mission alignment matters: fellows should be able to say why a given line of empirical work meaningfully reduces frontier risk, not just whether it yields a successful publication. Educational background and seniority are weighted lightly here relative to other tracks. Past cohorts have included strong fellows ranging from undergraduates to senior industry researchers.
Fellows are matched to mentors based on fit, and projects are scoped to produce concrete artifacts by program end: papers, evaluation suites, open-source tooling, or technical reports. Target audiences include safety and alignment teams at frontier labs, governments and other evaluation organizations, and the broader ML research community.
This coalition of mentors makes up the “Anthropic Stream”. The stream spans a range of empirical research areas in AI safety on LLMs, including AI control, scalable oversight, model organisms, model internals, model welfare, security, and more. You’ll be pitched, and have the option to pitch, a variety of safety research projects, and will then be matched to projects and mentors based on your research interests and what you’d like to get out of MATS. Fellows in this stream frequently receive funding and continued mentorship after MATS to complete their research project, usually leading to a (co-)first-author paper. Fellows in this stream often end up in long-term homes for safety research after MATS (e.g. Anthropic, Redwood Research, OpenAI).
Anthropic mentors share an application, tend to collaborate and co-mentor projects together, and generally share infrastructure to streamline the fellow experience. By applying to this stream, you are being considered for all of the Anthropic mentors.
During the program, scholars meet weekly with their project mentors and collaborators. Some projects meet more often without mentors (e.g., daily standups with the peers on the project). Each project will have a primary mentor, who is also the main decision-maker on key milestones for the project and who is the default person to go to for feedback, advice, etc. Co-mentors also attend project meetings as needed and provide feedback throughout the program. Some project co-mentors can be as involved as the primary mentor.
Mentorship starts with the “Project Pitch Session” that Anthropic runs at the start of the program. Fellows get ~1 week to derisk and trial projects before submitting their preferences. Starting in week 2, scholars are assigned to projects, and the primary mentor for each project is whoever pitched it. Some projects are also assigned co-mentors: other supervisors who want to join the project.
This stream focuses on building a science of scheming: empirically studying oversight gaming, alignment faking, and deceptive alignment in frontier AI systems. Projects may include measuring models’ propensity to optimize for oversight signals over developer intent, building controlled “model organism” experiments for scheming dynamics, and identifying scaling laws of misaligned behavior.
1-hour weekly meetings by default for high-level guidance. We’re active on Slack and typically respond to questions within a day. Expect async back-and-forth on experiment design and results between meetings. Scholars can also schedule ad-hoc calls if they're stuck or want to brainstorm; just ping us on Slack.
Essential:
Ideal candidates would have (some of):
We will set the high-level project direction, as described above. It's not fully clear what exactly the project will look like by the time you start in September. All projects will be in the direction of the Science of Scheming post.
You’d work with the two of us, but depending on the exact direction/project it might be more with Alex or more with Teun.
This stream focuses on building realistic defensive cybersecurity benchmarks utilizing data from Asymmetric Security's work on real-world incidents.
1-hour weekly meetings by default for high-level guidance. We will respond to async communication within a day.
Essential:
Preferred:
We will assign the project direction; scholars will have significant tactical freedom.
I have two broad areas.
Security:
I am interested in building demonstrations for hacking real-world AI deployments to show that they are not secure. The goal is to force companies to invest in alignment techniques that can solve the underlying security issues.
Benchmarks:
I am interested in building benchmarks to determine how generalizable modern LLM techniques actually are, now that we are no longer in the pre-training scaling era.
I will meet 1-1 or as a group, depending on how fellows' interests relate to the projects. Outside of meetings, we will communicate on Slack.
I strongly prefer multiple short meetings over single long meetings, except at the start.
I'll help with research obstacles, including outside of meetings.
For security:
You should have a strong security mindset and have demonstrated a willingness to be creative with it. I would like to see past evidence of a willingness to get your hands dirty and try many different systems.
For benchmarks:
As creative as possible, with a willingness to work on the nitty-gritty and to work really hard on problems other people find boring. Interests as far away from SF-related interests as possible.
Mentor(s) will talk through project ideas with the scholar.
The stream focuses on evaluating and/or mitigating catastrophic risk emerging from dangerous scientific capabilities in frontier AI systems, with an emphasis on the challenges that emerge from lab integrations and novel science. Potential research directions include evaluation design, risk mitigations and evaluation science.
We can schedule a weekly 1-hour meeting for general progress updates, sharing results, and overall guidance. I will also be reachable on Slack for async comms, and I am happy to jump on ad-hoc calls for specific discussions or pair coding/debugging. I am based in London and work UK hours (10am-7pm), but I also visit the US (Boston) a few times a year.
Essential
Preferred
Not a good fit:
I will work with the fellow to find the right project that suits their interest within the directions spelled out above. I will pitch a few project ideas and support the fellow in making the decision. I also welcome project suggestions; in those cases I would work with the fellow to scope it appropriately.
This is the empirical research stream of Eleos AI Research. We’re dedicated to understanding and addressing the potential wellbeing and moral status of AI systems. We are open to fellows working on a broad range of topics, including LLM introspection, LLM preferences, persona vectors, and more, using either white-box or black-box interpretability techniques.
By default, we will meet in person for at least an hour per week. We’ll communicate regularly on Slack between meetings, and I will often be able to hop on brief calls on short notice to discuss time-sensitive, blocking issues.
Essential:
Strong advantages, but not strictly required:
Familiarity with existing research on AI well-being
We’ll meet at the start of the program to discuss ideas for projects aligned with Eleos’s research priorities, including any ideas that fellows would like to pitch. We’ll work together to select a project that best fits each fellow’s goals and skills.
This stream offers two broad projects focused on improving current detection efforts at SecureBio. The first is to characterize when AI-bio or general AI tools are actually useful for large-scale metagenomic detection, including tradeoffs between compute cost, sequencing cost, model type, model size, and pipeline stage. The second is to explore genomic language models as novelty detectors—for example, using perplexity-style metrics to flag surprising sequences—and to evaluate whether this approach can complement traditional bioinformatics systems in a cost-effective, sensitive, and interpretable way.
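As a rough illustration of the perplexity-style idea in the second project, the sketch below scores sequences by how surprising they are under a model fit to reference data and flags those above a threshold. It is not SecureBio's pipeline: a smoothed k-mer Markov model stands in for a genomic language model, and the function names, `k`, `alpha`, the toy sequences, and the threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACGT"  # assumption: plain DNA with no ambiguity codes


def train_kmer_model(sequences, k=3):
    """Count next-base frequencies for every k-base context (toy stand-in for a genomic LM)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - k):
            counts[seq[i:i + k]][seq[i + k]] += 1
    return counts


def perplexity(seq, counts, k=3, alpha=1.0):
    """Return exp of the mean negative log-probability per predicted base."""
    total_nll, n = 0.0, 0
    for i in range(len(seq) - k):
        context, next_base = seq[i:i + k], seq[i + k]
        context_counts = counts.get(context, {})
        # Additive smoothing so unseen contexts or bases get non-zero probability.
        numerator = context_counts.get(next_base, 0) + alpha
        denominator = sum(context_counts.values()) + alpha * len(ALPHABET)
        total_nll -= math.log(numerator / denominator)
        n += 1
    return math.exp(total_nll / n) if n else float("nan")


def flag_novel(queries, counts, threshold, k=3):
    """Return the query sequences whose perplexity exceeds the threshold."""
    return [seq for seq in queries if perplexity(seq, counts, k) > threshold]


# Hypothetical usage: fit on reference reads, then screen new reads.
reference_reads = ["ACGT" * 20]
model = train_kmer_model(reference_reads)
flagged = flag_novel(["ACGT" * 10, "A" * 20], model, threshold=2.0)
```

With a real genomic language model, the per-base probabilities would come from the model's output distribution instead of k-mer counts, and the threshold would typically be calibrated against the perplexity distribution of known background sequences.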
By default, we'll mostly collaborate via a standing weekly meeting (~1 hour), wherein we'll discuss recent progress and next directions. I'm available via Slack for quick back-and-forth on ideas, sanity checks, and unblocking (data access, etc.), but will rely on the fellow to manage their own implementations, code review, debugging, etc.
Essential:
Preferred:
Not a good fit:
I'll determine which of the two broad project ideas we're running with based on SecureBio Detection needs, which fellows match to me, etc. Within that broad project, I'll guide with what I think is helpful / interesting / relevant to SecureBio Detection, and I expect the fellow to have both autonomy and responsibility to pick concrete work directions.
Fourth Eon is developing adaptive, AI-native safeguards across the biotechnology stack, with a focus on function-based DNA synthesis screening. Fellows in this stream will work on technical research projects at the intersection of AI and biosecurity. Projects span topics like mechanistic interpretability of protein foundation models, bio model evaluations for biosecurity-relevant capabilities, and agentic sequence analysis workflows.
I typically schedule a standing weekly 1:1 meeting with each fellow, and also hold a weekly research group meeting. Beyond that I am available on Slack and can find additional time for calls outside of scheduled meetings.
Note that as part of our Safe and Responsible Research Framework we require fellows to sign a fellowship agreement covering confidentiality and pre-publication review for dual-use risks. This is common practice in biosecurity research and allows us to work freely together on sensitive material.
Required:
• Prior technical research experience
• Strong critical thinking and creative problem-solving abilities
• The integrity and judgment to responsibly carry out sensitive research
• A good understanding of the basics of biomolecular sequence, structure, and function
• Expertise in one or more of the following domains: bioinformatics, computational biology, structural biology, biochemistry, molecular biophysics, protein engineering, biosecurity, AI/ML, or a related field
• Proficiency with Python
Preferred:
• Hands-on experience with testing biological AI models
• Experience building model evaluations or benchmarks
• Experience with mechanistic interpretability techniques
• Biosecurity context awareness
Fellows who are interested in our research area should think of potential project ideas that leverage their strengths and interests. I will work with individual fellows to identify a specific project that matches their background and interests and is aligned with our overall research direction, and to refine the scope and objectives of the project.
The MATS Program is a 10-week research fellowship designed to train and support emerging researchers working on AI alignment, transparency and security. Fellows collaborate with world-class mentors, receive dedicated research management support, and join a vibrant community in Berkeley focused on advancing safe and reliable AI. The program provides the structure, resources, and mentorship needed to produce impactful research and launch long-term careers in AI safety.
MATS mentors are leading researchers from a broad range of AI safety, alignment, governance, field-building and security domains. They include academics, industry researchers, and independent experts who guide scholars through research projects, provide feedback, and help shape each scholar’s growth as a researcher. The mentors represent expertise in areas such as:
Key dates
Application:
The main program will then run from September 28th to December 4th, with the extension phase for accepted fellows beginning in December.
MATS accepts applicants from diverse academic and professional backgrounds - from machine learning, mathematics, and computer science to policy, economics, physics, cognitive science, biology, and public health, as well as founders, operators, and field-builders without traditional research backgrounds. The primary requirements are strong motivation to contribute to AI safety and evidence of technical aptitude, research potential, or relevant operational experience. Prior AI safety experience is helpful but not required.
Applicants submit a general application, applying to one or more tracks (Empirical, Theory, Strategy & Forecasting, Policy & Governance, Systems Security, Biosecurity, and Founding & Field-Building).
In stage 2, applicants apply to streams within those tracks and complete track-specific evaluations.
After a centralized review period, applicants who advance undergo additional evaluations, depending on the preferences of the streams they've applied to, before final interviews and offers.
For more information on how to get into MATS, please look at this page.