From Mechanistic Interpretability to Mechanistic Reasoning - Antoine Bosselut
Abstract
Pretrained language models (LMs) encode implicit representations of knowledge in their parameters. Despite this observation, our best methods for interpreting these representations yield few actionable insights on how to manipulate this parameter space for downstream benefit. In this talk, I will present work on methods that simulate machine reasoning by localizing and modifying parametric knowledge representations. First, I will present a method for discovering knowledge-critical subnetworks within pretrained language models, and show that these sparse computational subgraphs are responsible for the model’s ability to encode specific pieces of knowledge. Then, I will present a new reasoning algorithm, RECKONING, a bi-level optimisation procedure that dynamically encodes and reasons over new knowledge at test time using the model’s existing learned knowledge representations as a scratchpad. Finally, I will discuss the next steps and challenges in using internal model mechanisms for reasoning.
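To make the bi-level structure concrete, here is a minimal first-order sketch of a RECKONING-style procedure in PyTorch: an inner loop takes a few gradient steps that encode new knowledge into a copy of the model's weights, and an outer loop grades the adapted copy on a reasoning question and pushes that signal back to the base weights. Everything here (the TinyLM module, the toy data, the first-order gradient transfer) is hypothetical scaffolding for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """Toy stand-in for a pretrained LM: embedding layer + linear head."""
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens):                           # tokens: (batch, seq)
        return self.head(self.emb(tokens).mean(dim=1))   # (batch, vocab) logits

def inner_encode(base, knowledge, steps=3, lr=1e-2):
    """Inner loop: a few gradient steps that write `knowledge` into a
    copy of the base weights (the parametric 'scratchpad')."""
    fast = TinyLM()
    fast.load_state_dict(base.state_dict())   # start from the base weights
    opt = torch.optim.SGD(fast.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    inputs, targets = knowledge
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(fast(inputs), targets).backward()
        opt.step()
    fast.zero_grad()                           # clear inner-loop gradients
    return fast

# Outer loop: evaluate the adapted copy on the reasoning question, then apply
# its gradients to the base model (a first-order shortcut, as in FOMAML).
base = TinyLM()
outer_opt = torch.optim.SGD(base.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

knowledge = (torch.randint(0, 100, (4, 8)), torch.randint(0, 100, (4,)))
question, answer = torch.randint(0, 100, (2, 8)), torch.randint(0, 100, (2,))

for _ in range(10):
    fast = inner_encode(base, knowledge)
    loss_fn(fast(question), answer).backward()  # gradients land on `fast`
    for p_base, p_fast in zip(base.parameters(), fast.parameters()):
        p_base.grad = p_fast.grad.clone()       # first-order transfer to base
    outer_opt.step()
    outer_opt.zero_grad()
```

The first-order transfer sidesteps differentiating through the inner optimization; a faithful bi-level setup would backpropagate through the inner gradient steps themselves.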
Bio
Antoine Bosselut is an assistant professor in the School of Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL). He was a postdoctoral scholar at Stanford University and a Young Investigator at the Allen Institute for AI (AI2). He completed his PhD at the University of Washington and was a student researcher at Microsoft Research. His research interests are in building systems that mix knowledge and language representations to solve problems in NLP, specializing in commonsense representation and reasoning.