Controlling complexity by sharing parameters and minimizing variation

Doctoral dissertation
McGill University
supervised learning; deep learning; regularization
Abstract:

In this thesis, we investigate several variants of online learning problems with different feedback models and objectives. First, we consider the pure exploration problem with multi-action probes. We design algorithms that find the best action, or the best several actions, with high probability while using as few probes as possible. Then we study the side-observation model in the regret minimization setting. We derive a novel finite-time, distribution-dependent lower bound and design algorithms that are asymptotically optimal and minimax optimal. Finally, we investigate the conservative bandit problem, where the objective is to minimize regret while keeping the cumulative reward above a baseline. We design algorithms for several variants of the problem and derive a lower bound.




2013 AI Doctoral and MSc Thesis Awards

