Stochastic Bandits: Foundations and Current Perspectives

Shipra Agrawal (Columbia University)
Data-Driven Decision Processes Boot Camp

This talk focuses on stochastic bandits, a fundamental model for sequential learning in which the rewards of each action are drawn independently and identically distributed (i.i.d.) from a fixed distribution. We will cover the two main algorithms, Upper Confidence Bound (UCB) and Thompson Sampling, and then discuss how they can be adapted to incorporate various additional constraints.
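
To make the two algorithms named in the abstract concrete, here is a minimal sketch, not taken from the talk itself. It assumes Bernoulli rewards, the textbook UCB1 index (empirical mean plus sqrt(2 ln t / n)), and Thompson Sampling with uniform Beta(1, 1) priors. The function names, the arm means, and the horizon are all illustrative assumptions.

```python
import math
import random


def ucb1(means, horizon, rng):
    """Run UCB1 on Bernoulli arms with hidden means; return pulls per arm.

    Assumption: the textbook UCB1 index, mean + sqrt(2 ln t / n).
    """
    k = len(means)
    counts = [0] * k   # number of pulls of each arm
    sums = [0.0] * k   # total reward collected from each arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once so each index is defined
        else:
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


def thompson_sampling(means, horizon, rng):
    """Run Thompson Sampling on Bernoulli arms; return pulls per arm.

    Assumption: independent Beta(1, 1) priors on each arm's mean.
    """
    k = len(means)
    successes = [0] * k
    failures = [0] * k
    for _ in range(horizon):
        # Sample a plausible mean for each arm from its Beta posterior,
        # then play the arm whose sample is largest.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)
        ]
        arm = max(range(k), key=lambda i: samples[i])
        if rng.random() < means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [successes[i] + failures[i] for i in range(k)]


if __name__ == "__main__":
    hypothetical_means = [0.2, 0.5, 0.8]  # illustrative values only
    print("UCB1 pulls:    ", ucb1(hypothetical_means, 5000, random.Random(0)))
    print("Thompson pulls:", thompson_sampling(hypothetical_means, 5000, random.Random(0)))
```

Both routines balance exploration and exploitation: UCB1 adds an explicit optimism bonus that shrinks as an arm is pulled more often, while Thompson Sampling randomizes according to the posterior. In either case, the pulls should concentrate on the arm with the highest mean as the horizon grows.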