The stream will focus on conceptual, empirical, and theoretical work on scalable oversight and control. This includes, but is not limited to, creating model organisms for specific failure modes, designing training procedures against them, and making progress on subproblems involved in safety cases.
Shi Feng leads a research group working on oversight and control. He is an assistant professor at George Washington University. Prior to that, he was a postdoc in the NYU Alignment Research Group under Sam Bowman. He currently focuses on deception and collusion, with an emphasis on propensity and evaluation realism.
Scholars will collaborate with members of the group but are also free to find new collaborators.
A research agenda document with a short list of project ideas will be shared ahead of time. Scholars can also brainstorm and pitch their own ideas, provided they align with the research agenda. We will decide on assignments in week 2.