S03E04: The one with Flavio Calmon talking about Information-theoretic Tools for Machine Learning

Title: On representations and fairness: information-theoretic tools for machine learning

Abstract: Information theory can shed light on the algorithm-independent limits of learning from data and serve as a design driver for new machine learning algorithms. In this talk, we discuss a set of information-theoretic tools that can be used to (i) help understand fairness and discrimination in machine learning and (ii) characterize data representations learned by complex learning models. On the fairness side, we explore how local perturbations of distributions can help identify proxy features for discrimination, and how a formulation inspired by information projection can be applied to repair models for bias. On the representation learning side, we explore a theoretical tool called the principal inertia components (PICs), which enjoy a long history in the statistics and information theory literature. We use the PICs to scale up a multivariate statistical tool called correspondence analysis (CA) using neural networks, enabling data dependencies to be visualized and interpreted at a large scale. We illustrate these techniques on both synthetic and real-world datasets, and discuss future research directions.
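As background for the abstract, classical correspondence analysis (the tool the talk scales up with neural networks) amounts to an SVD of a standardized contingency table, and the squared singular values it produces are the principal inertia components. A minimal NumPy sketch, with illustrative names not taken from the talk:

```python
import numpy as np

def correspondence_analysis(N):
    """Classical CA of a contingency table N (counts).

    Returns principal row coordinates, principal column coordinates,
    and the squared singular values (the principal inertia components).
    This is a textbook sketch, not the talk's neural-network method.
    """
    P = N / N.sum()                        # correspondence matrix
    r = P.sum(axis=1)                      # row masses
    c = P.sum(axis=0)                      # column masses
    # Standardized residuals: D_r^{-1/2} (P - r c^T) D_c^{-1/2}
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    F = U * sing / np.sqrt(r)[:, None]     # principal row coordinates
    G = Vt.T * sing / np.sqrt(c)[:, None]  # principal column coordinates
    return F, G, sing ** 2                 # sing**2 are the PICs

# Toy contingency table of counts
N = np.array([[20., 10., 5.],
              [10., 15., 10.],
              [5., 10., 20.]])
F, G, pics = correspondence_analysis(N)
```

The sum of the PICs equals the total inertia (Pearson's chi-square statistic divided by the sample size), which is what makes them a natural measure of dependence between the two variables of the table.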