Petri: An open-source auditing tool to accelerate AI safety research

MATS Fellow:

Isha Gupta

Authors:

Kai Fronsdal, Isha Gupta, Abhay Sheshadri, Jonathan Michala, Stephen McAleer, Rowan Wang, Sara Price, Samuel R. Bowman

Citations

0 Citations

Abstract:

We're releasing Petri (Parallel Exploration Tool for Risky Interactions), an open-source framework for automated auditing that uses AI agents to test the behaviors of target models across diverse scenarios. When applied to 14 frontier models with 111 seed instructions, Petri successfully elicited a broad set of misaligned behaviors including autonomous deception, oversight subversion, whistleblowing, and cooperation with human misuse. The tool is available now at github.com/safety-research/petri.

Recent research

What Should Frontier AI Developers Disclose About Internal Deployments?

Authors:

Jacob Charnock, Raja Moreno, Justin Miller, William L. Anderson

Date:

April 24, 2026

Citations:

Where is the Mind? Persona Vectors and LLM Individuation

Authors:

Pierre Beckmann

Date:

April 20, 2026

Citations:

Petri: An open-source auditing tool to accelerate AI safety research

Recent research

Frequently asked questions