You are all cordially invited to the AMLab seminar on Tuesday, February 21 at 16:00 in C3.163, where Artem Grotov will give a talk titled “Deep Counterfactual Learning”. Afterwards there are the usual drinks and snacks!
Abstract: Deep learning is increasingly important for training interactive systems such as search engines and recommenders. Deep models are applied to a broad range of tasks, including ranking, text similarity, and classification. Training a neural network to perform classification requires a lot of labeled data. While collecting large labeled data sets is expensive and sometimes impossible, for example for personalized tasks, there is often an abundance of logged data collected from user interactions with an existing system. This type of data is called logged bandit feedback, and utilizing it is challenging because such data is noisy, biased and incomplete. We propose a learning method, Constrained Counterfactual Risk Minimisation (CCRM), based on counterfactual risk minimization of an empirical Bernstein bound, to tackle this problem and learn from logged bandit feedback. We evaluate CCRM on an image classification task. We find that CCRM performs well in practice and outperforms existing methods.
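For readers unfamiliar with the setting, here is a minimal sketch of a generic counterfactual risk minimization objective on logged bandit feedback: a clipped inverse-propensity estimate of the risk plus a variance penalty of the kind an empirical Bernstein bound suggests. It is not the CCRM method from the talk, whose details the abstract does not give. The function name `crm_objective` and the parameters `lam` and `clip` are assumptions chosen for illustration.

```python
import numpy as np


def crm_objective(loss, new_prob, logged_prob, lam=1.0, clip=10.0):
    """Clipped IPS risk estimate plus a variance penalty (illustrative only).

    loss        -- observed loss for each logged action
    new_prob    -- probability the new policy assigns to the logged action
    logged_prob -- probability the logging policy assigned to it (propensity)
    lam, clip   -- hypothetical hyperparameters: penalty weight and weight cap
    """
    loss = np.asarray(loss, dtype=float)
    ratio = np.asarray(new_prob, dtype=float) / np.asarray(logged_prob, dtype=float)
    weights = np.minimum(ratio, clip)      # cap importance weights to limit variance
    weighted = loss * weights              # per-sample reweighted loss
    n = weighted.size
    risk = weighted.mean()                 # inverse-propensity risk estimate
    variance = weighted.var(ddof=1) if n > 1 else 0.0
    return risk + lam * np.sqrt(variance / n)


# Four logged interactions where the new policy matches the logging policy.
value = crm_objective([1, 0, 1, 0], [0.5] * 4, [0.5] * 4)
```

Setting `lam=0` recovers the plain inverse-propensity estimate; a larger `lam` favors policies whose estimated risk is less uncertain.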