LawZero

We are excited to supervise projects that:

Study the causes and implications of (multi-agent) situational awareness;
Contribute to LawZero's Scientist AI, in the form of contextualization and uncertainty estimation.

Stream overview

We are especially interested in supervising projects about:

(1) (Multi-agent) situational awareness. Language models can recognize data-agnostic, a-semantic perturbations to their activations; when they fail to do so, they can learn, in context, to discriminate perturbed from unperturbed activations. Moreover, they can identify (or learn in context to identify) properties such as the magnitude of a perturbation or the layer at which it occurs, often generalizing to unseen examples. We want to study (i) the causes of these abilities, (ii) the extent to which they constitute, or extend to, epistemic privilege, and (iii) whether they facilitate predicting, by similarity, how other agents react to one's actions.
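As a rough illustration of what a data-agnostic perturbation with a controlled magnitude could look like, the sketch below adds a random direction, drawn independently of the input, to a toy activation vector. This is an assumption for illustration only, not necessarily the setup used in the stream; the function name `perturb` and the toy vector are hypothetical, and a real experiment would apply this inside a chosen layer of a language model.

```python
import math
import random


def perturb(activation, magnitude, rng):
    """Add a random-direction perturbation whose L2 norm equals `magnitude`.

    The direction is sampled without looking at `activation`, which is one
    (assumed) reading of "data-agnostic".
    """
    direction = [rng.gauss(0.0, 1.0) for _ in activation]
    norm = math.sqrt(sum(d * d for d in direction))
    return [a + magnitude * d / norm for a, d in zip(activation, direction)]


rng = random.Random(0)               # fixed seed for reproducibility
clean = [0.5, -1.0, 2.0, 0.0]        # toy stand-in for a hidden state
noisy = perturb(clean, magnitude=3.0, rng=rng)

# The size of the change is set by `magnitude`, not by the input.
delta_norm = math.sqrt(sum((n - c) ** 2 for n, c in zip(noisy, clean)))
```

Varying `magnitude` (or the layer at which such a function is applied) gives the kind of labeled variation that a model could be asked to identify.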

(2) Contextualization, a way of pre-processing data that situates statements in context, turning potentially unqualified claims into truth-apt claims. For a sketch of how we envision this being useful for safety, see this blogpost, which details its role in LawZero’s project of building intelligent systems without the capacity for goals [12]; such systems might in turn serve as guardrails [13]. In particular, contextualization decomposes a corpus of text into statements with attribution sources [14], and we are interested in testing whether this helps mitigate preference biases in the presence of unreliable agent-generated data.
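A toy sketch of the idea, under strong simplifying assumptions: each sentence is paired with the source that produced it, so that a contested claim ("The bridge is safe") becomes an attributed one that is true regardless of who is right. The class `AttributedStatement`, the function `contextualize`, and the naive split on periods are hypothetical choices made for illustration, not LawZero's pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributedStatement:
    source: str  # who made the claim (hypothetical field)
    claim: str   # the original, possibly unqualified claim

    def contextualized(self) -> str:
        """Render the claim as a statement about what the source said."""
        return f'{self.source} states: "{self.claim}"'


def contextualize(corpus):
    """Turn (source, text) pairs into one attributed statement per sentence."""
    statements = []
    for source, text in corpus:
        for sentence in text.split("."):  # naive sentence splitting
            sentence = sentence.strip()
            if sentence:
                statements.append(AttributedStatement(source, sentence))
    return statements


corpus = [
    ("agent_A", "The bridge is safe. Traffic is light."),
    ("agent_B", "The bridge is unsafe."),
]
statements = contextualize(corpus)
for s in statements:
    print(s.contextualized())
```

The two agents contradict each other, yet the attributed statements are jointly consistent, which is what makes them usable as training data when sources are unreliable.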

(3) Uncertainty estimation for partially trained models. We are studying ensembles, epistemic neural networks, calibration, conformal prediction techniques, and other methods in synthetic environments. We are especially interested in projects that use amortized inference methods (such as GFlowNets [15, 16, 17]) to approximate posteriors over latent variables, such as (i) sources or (ii) predictors behind an autoregressive model, so that predictive uncertainty can be estimated from learned distributions rather than from point estimates.
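Of the methods listed, split conformal prediction is perhaps the simplest to sketch. The example below is a minimal version for one-dimensional regression: residuals on a held-out calibration set determine an interval half-width that, assuming calibration and test points are exchangeable, covers the true value with probability at least 1 - alpha. The model `predict` and the calibration data are hypothetical stand-ins.

```python
import math


def conformal_quantile(residuals, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration residuals."""
    n = len(residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(residuals)[rank - 1]


def predict(x):
    """Hypothetical stand-in for a partially trained model."""
    return 2.0 * x


# Held-out (x, y) pairs that were not used to fit the model.
calibration = [
    (0.0, 0.3), (1.0, 1.8), (2.0, 4.5), (3.0, 5.6), (4.0, 8.1),
    (5.0, 9.9), (6.0, 12.4), (7.0, 13.7), (8.0, 16.2),
]
residuals = [abs(y - predict(x)) for x, y in calibration]

q = conformal_quantile(residuals, alpha=0.2)
center = predict(10.0)
interval = (center - q, center + q)  # prediction interval for x = 10
```

Note that this yields a single interval width for all inputs; the amortized-inference approaches mentioned above aim instead at uncertainty derived from a learned distribution over latent variables.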

Mentors

Yoshua Bengio

Mila

Co-President and Scientific Director (LawZero) / Full Professor (UdeM) / Founder and Scientific Advisor (Mila)

Montreal

—

Agent Foundations

Monitoring

Control

Red-Teaming

Scalable Oversight

Yoshua Bengio is Full Professor of Computer Science at Université de Montréal, Co-President and Scientific Director of LawZero, and Founder and Scientific Advisor of Mila. He also holds a Canada CIFAR AI Chair. Considered one of the world’s leaders in artificial intelligence and deep learning, he is the recipient of the 2018 A.M. Turing Award, often described as the "Nobel Prize of computing." He is the most-cited computer scientist worldwide and the most-cited living scientist across all fields (by total citations).

Professor Bengio is a Fellow of both the Royal Society of London and the Royal Society of Canada, an Officer of the Order of Canada, a Knight of the Legion of Honor of France, a member of the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology, and Chair of the International AI Safety Report.

Damiano Fornasiere

LawZero

Senior AI safety research scientist

Montreal

—

Agent Foundations

Monitoring

Control

Red-Teaming

Scalable Oversight

Damiano is a research scientist at LawZero, where he works on (i) the maths behind the Scientist AI and (ii) interpretability and evaluation techniques for situational awareness and introspection.

Oliver Richardson (Oli)

LawZero

Senior ML Research Scientist (LawZero) / Postdoctoral Fellow (UdeM)

Montreal

—

Agent Foundations

Monitoring

Control

Red-Teaming

Scalable Oversight

Oli(ver) is a computer scientist (a staff member at LawZero and postdoc under Yoshua Bengio) with unusually broad scientific and mathematical expertise.

He is a sucker for pretty demos and grand unifying theories, unfortunately sometimes losing sight of what is practical. Over the last few years (i.e., during his PhD at Cornell), Oli has discovered a beautiful theory describing how a great deal of artificial intelligence, classical and modern, can be fruitfully understood as resolving a natural information-theoretic measure of epistemic inconsistency. There remain many unanswered questions, but the hope is that this already much clearer view can lead to powerful generalist AI systems that are safer because they fundamentally do not meaningfully have goals or desires.

Mirko Bronzi

LawZero

Senior Applied Research Scientist

—

Agent Foundations

Dangerous Capability Evals

Monitoring

Control

Red-Teaming

Mirko is a research scientist with over 15 years of experience in Machine Learning and Natural Language Processing, specializing in applied research across diverse industries.

His expertise spans Python, PyTorch, TensorFlow, and NLP frameworks like Hugging Face, enabling him to drive impactful machine learning solutions for businesses.

Mirko is passionate about optimizing deep learning model performance and advancing software engineering best practices in deep learning projects.

Mentorship style

Mentees will be assigned a primary mentor and a secondary mentor, so that mentorship on both research and engineering is covered.
We provide at least one 1-hour meeting per week with both mentors and, typically, daily availability of both mentors by email / Slack during workdays.
The independence expected of a scholar depends on the scholar's experience. A priori, we do not expect research independence, but we do expect implementation independence roughly comparable to that of a CS grad student.

Fellows we are looking for

Essential knowledge:

Foundations of machine learning and deep learning;
Transformer architecture and large language models;
Empirical AI safety literature (e.g., evaluations, guardrails, interpretability, …).

Essential experience:

Python;
Designing and implementing machine learning workflows using PyTorch;
Supervised or RL fine-tuning of language models, at least with toy experiments and some publicly available datasets;
Prompt engineering.

Desired experience:

Experience with libraries such as vLLM, TRL, Hugging Face;
Familiarity with statistical hypothesis testing.

Bonus:

Async APIs;
Multi-GPU training.

We encourage but do not necessarily expect collaborations between mentees in our stream (certain projects can be pursued in parallel, while others benefit from people working together).
We are open to collaborations with MATS scholars belonging to other streams, if vetted by the mentor(s).
We encourage collaborations with people inside LawZero, subject to respecting first-mentorship (as per MATS' mentorship policy).

Project selection

The stream's mentors will propose and present the projects during week 1.
The mentee(s) will engage with the projects during week(s) 1 or 2 (e.g., reading the literature, replicating a paper, building a demo / MVP).
By the end of week 2 at the latest, the mentor(s) and mentee(s) will agree on a project, chosen according to interest, feasibility (the ideal proxy goal is to publish an ML conference paper), the state of the literature, relevance to AI safety, and the expertise of the mentors and mentees.
Projects may be re-assigned in exceptional circumstances, for example if a discovery suggests a sudden change in research direction.
