Reliability and interactive debugging for language models

Abstract: Large language models have permeated our everyday lives and are used in critical decision-making scenarios that can affect millions of people. Despite their impressive progress, model deficiencies may exacerbate harmful biases or lead to catastrophic failures. In this talk, I discuss several important considerations for reliable model deployment that engender user trust. Beyond improved accuracy on new and complex tasks, users want more transparent models that better explain their predictions and are robust to data biases and annotation artifacts. They also want to be equipped to interact with these models in order to better understand and debug them. I will discuss a range of training and inference techniques for building these aspects of reliability into models, transitioning from classifier models with limited interaction potential to massive language models that can converse with humans and communicate with external tools. I will describe a new optimization technique that discovers error-prone data slices for users to examine and trains a robust classifier to improve performance on biased data. I will then discuss a prompt-based approach to explaining model predictions on common-sense reasoning tasks, which users can also leverage to probe model behavior. Finally, I will present a framework for automatically decomposing unseen composite tasks that require multi-step reasoning and the use of external tools, and delve into how the framework supports user debugging. I will also discuss future directions for building reliability into interactive and personalized language models.
Bio: Bhargavi is a PhD candidate in the natural language processing group at the University of Washington, where she is advised by Hannaneh Hajishirzi and Luke Zettlemoyer. She also holds a Master's in language technology from Carnegie Mellon University. Her research focuses on interpretability, robustness, and complex reasoning and tool use (e.g., APIs, search, code) for language models. During her PhD, she has interned at Meta AI Research, Google AI Research, Microsoft Research, and the Allen Institute for AI.