You are all cordially invited to the AMLab seminar on Tuesday February 28 at 16:00 in C3.163, where ChangYong Oh will give a talk titled “High dimensional Bayesian Optimization”. Afterwards there are the usual drinks and snacks!
Abstract: Bayesian optimization has been successful in many hyper-parameter optimization and reinforcement learning problems. Still, many obstacles prevent it from being applied more widely. Among these, we focus on methods for high dimensional spaces. To resolve the difficulties of high dimensional Bayesian optimization, we devised a principled method to reduce the predictive variance of the Gaussian process, along with other assistive methods for its successful application.
First, I will give a brief explanation of general Bayesian optimization. Second, I will explain what makes high dimensional problems harder, namely the ‘boundary effect’ and the ‘hollow ball problem’. Third, I will propose solutions to these problems: ‘variance reduction’ and ‘adaptive search region’.
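To make the setting concrete, here is a minimal, self-contained Bayesian optimization loop in Python — my own illustrative sketch, not the speaker's method. It uses a one-dimensional toy objective, a Gaussian process with an RBF kernel, and an upper-confidence-bound acquisition function; all names and parameter values are made up for illustration.

```python
import numpy as np

# Toy 1-D objective with its maximum at x = 0.3; the talk's setting is
# high-dimensional, but the basic loop is the same.
def objective(x):
    return -(x - 0.3) ** 2

def rbf(a, b, lengthscale=0.2):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """GP predictive mean and variance at candidate points Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.maximum(var, 1e-12)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 3)          # small initial design
y = objective(X)
grid = np.linspace(0, 1, 200)     # candidate points

for _ in range(10):
    mu, var = gp_posterior(X, y, grid)
    ucb = mu + 2.0 * np.sqrt(var)      # upper confidence bound
    x_next = grid[np.argmax(ucb)]      # maximize the acquisition function
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))

best = X[np.argmax(y)]            # should end up near the optimum at 0.3
```

In higher dimensions this plain loop degrades: among other issues, acquisition maximization tends to land on the boundary of the search region, which relates to the ‘boundary effect’ the talk addresses.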
You are all cordially invited to the AMLab seminar on Tuesday February 21 at 16:00 in C3.163, where Artem Grotov will give a talk titled “Deep Counterfactual Learning”. Afterwards there are the usual drinks and snacks!
Abstract: Deep learning is increasingly important for training interactive systems such as search engines and recommenders, which are applied to a broad range of tasks including ranking, text similarity, and classification. Training neural networks to perform classification requires a lot of labeled data. While collecting large supervised labeled data sets is expensive and sometimes impossible, for example for personalized tasks, there is often an abundance of logged data collected from user interactions with an existing system. This type of data is called logged bandit feedback, and utilizing it is challenging because such data is noisy, biased and incomplete. We propose a learning method, Constrained Counterfactual Risk Minimisation (CCRM), based on counterfactual risk minimization with an empirical Bernstein bound, to tackle this problem and learn from logged bandit feedback. We evaluate CCRM on an image classification task. We find that CCRM performs well in practice and outperforms existing methods.
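The abstract does not spell out the CCRM objective, so the sketch below only illustrates the general counterfactual-risk-minimization idea it builds on: estimate a new policy's risk from logged bandit feedback with inverse-propensity weighting, and penalize the estimate's empirical variance (the role played by the empirical Bernstein bound). The data, policy class, and penalty weight are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical logged bandit feedback: for each context we logged one
# action, its logging propensity, and the observed loss.
n, n_actions = 1000, 3
contexts = rng.integers(0, 2, n)           # binary contexts
logged_a = rng.integers(0, n_actions, n)   # uniform logging policy
propensity = np.full(n, 1.0 / n_actions)
best_action = contexts % n_actions         # true best action per context
loss = (logged_a != best_action).astype(float)

def crm_objective(policy, lam=0.5):
    """Counterfactual risk: inverse-propensity-scored estimate plus a
    variance penalty (a simplified stand-in for the Bernstein term)."""
    # policy[c] = deterministic action chosen for context c
    w = (policy[contexts] == logged_a) / propensity   # importance weights
    u = w * loss
    return u.mean() + lam * np.sqrt(u.var() / n)

# Brute-force search over the (tiny) deterministic policy space.
policies = [np.array([a0, a1]) for a0 in range(3) for a1 in range(3)]
best_policy = min(policies, key=crm_objective)   # recovers [0, 1]
```

The variance penalty discourages policies whose risk estimate rests on a few high-weight logged samples, which is the main failure mode of plain inverse-propensity scoring.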
You are all cordially invited to the AMLab seminar on Tuesday February 14 at 16:00 in C3.163, where Jakub Tomczak will give a talk titled “Improving Variational Auto-Encoders using volume-preserving flows: A preliminary study”. Afterwards there are the usual drinks and snacks!
Abstract: Variational auto-encoders (VAE) are scalable and powerful generative models. However, the choice of the variational posterior determines the tractability and flexibility of the VAE. Commonly, latent variables are modeled using a normal distribution with a diagonal covariance matrix. This is computationally efficient, but typically not flexible enough to match the true posterior distribution. One way of enriching the variational posterior is to apply normalizing flows, i.e., a series of invertible transformations to latent variables with a simple posterior. Applying general normalizing flows requires calculating the Jacobian determinant, which can be computationally troublesome. However, it is possible to design a series of transformations for which the Jacobian determinant equals 1, so-called volume-preserving flows. During the presentation I will describe my preliminary results on a new volume-preserving flow called the Householder flow, and an extension of the linear Inverse Autoregressive Flow.
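To illustrate why volume-preserving flows avoid the Jacobian cost, here is a small numpy sketch of a Householder step. In the actual Householder flow the reflection vectors are produced by the encoder; here they are random stand-ins. A Householder reflection is orthogonal, so its Jacobian determinant is ±1 and the log-determinant term in the variational bound vanishes.

```python
import numpy as np

def householder_flow(z, v):
    """One Householder step: z' = H z with H = I - 2 v v^T / ||v||^2.
    H is orthogonal, so |det(dz'/dz)| = 1: the step preserves volume and
    contributes no Jacobian term to the variational objective."""
    v = v / np.linalg.norm(v)
    return z - 2.0 * v * (v @ z)

rng = np.random.default_rng(0)
d = 4
z = rng.normal(size=d)        # sample from the simple diagonal posterior
vs = rng.normal(size=(3, d))  # reflection vectors (learned in the real flow)
for v in vs:                  # chain a few volume-preserving steps
    z = householder_flow(z, v)

# The Jacobian of a single step, written out explicitly:
v0 = vs[0] / np.linalg.norm(vs[0])
H = np.eye(d) - 2.0 * np.outer(v0, v0)   # orthogonal, det(H) = -1
```

A product of several Householder reflections can represent any orthogonal matrix, which is what lets a short chain of such steps reshape a diagonal-covariance posterior toward a full-covariance Gaussian.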
You are all cordially invited to the AMLab seminar on Tuesday February 7 at 16:00 in C3.163, where Thijs van Ommen will give a talk titled “Recognizing linear structural equation models from observational data”. Afterwards there are the usual drinks and snacks!
Abstract: In a linear structural equation model, each variable is a linear function of other variables plus noise, and some noise terms may be correlated. Such a model can be represented by a mixed graph, with directed edges for causal relations and bidirected edges for correlated noise terms. Our goal is to learn the graph structure from observational data. To do this, we need to consider what constraints a model imposes on the observed covariance matrix. Some of these constraints do not correspond to (conditional) independences, and are not well understood. In particular, it is not even clear how to tell, by looking at two graphs, whether they impose exactly the same constraints. I will describe my progress in mapping out these models and their constraints.
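As a concrete illustration (my own toy example, not from the talk): the covariance matrix implied by a mixed graph can be written directly, with the directed-edge coefficients collected in a matrix B and the noise covariances (including any bidirected edges) in Omega, and then a constraint can be checked numerically. The chain x1 → x2 → x3 with independent noises imposes the conditional independence x1 ⊥ x3 | x2, which for Gaussians is a vanishing partial covariance.

```python
import numpy as np

# Mixed graph for x1 -> x2 -> x3 with independent noises:
# B[i, j] is the coefficient of x_j in the equation for x_i.
B = np.array([[0.0, 0.0, 0.0],
              [0.7, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
Omega = np.diag([1.0, 1.0, 1.0])  # a bidirected edge x1 <-> x3 would put a
                                  # nonzero entry in Omega[0, 2] and Omega[2, 0]

# x = B x + eps with cov(eps) = Omega, so x = (I - B)^{-1} eps and the
# implied covariance is Sigma = (I - B)^{-1} Omega (I - B)^{-T}.
Ainv = np.linalg.inv(np.eye(3) - B)
Sigma = Ainv @ Omega @ Ainv.T

# The chain imposes x1 _||_ x3 | x2: the partial covariance vanishes.
partial = Sigma[0, 2] - Sigma[0, 1] * Sigma[1, 2] / Sigma[1, 1]
```

Adding the bidirected edge would destroy this constraint, and richer graphs impose non-independence constraints on Sigma (e.g. vanishing determinants of submatrices) that this simple check does not capture — which is exactly the territory the talk maps out.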