You are all cordially invited to the AMLab seminar on Tuesday April 4 at 16:00 in C3.163, where Tineke Blom will give a talk titled “Causal Discovery in the Presence of Measurement Error”. Afterwards there are the usual drinks and snacks!
Abstract: Causal discovery algorithms can predict causal relationships based on several assumptions, which include the absence of measurement error. However, this assumption is most likely violated in practical applications, resulting in erroneous, irreproducible results. In this work, we examine the effect of different types of measurement error in a linear model of three variables, which is a minimal example of an identifiable causal relationship. We show that the presence of unknown measurement error makes it impossible to detect independences between the actual variables from the data using regular statistical testing and conventional thresholds for (in)dependence. We show that for limited measurement error, we can obtain consistent causal predictions by allowing for a small amount of dependence between (conditionally) independent variables. We illustrate our results in both simulated and real-world protein-signaling data.
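For readers unfamiliar with the phenomenon, here is a minimal numerical sketch (my own illustration, not the speaker's code or model): in a linear chain X → Y → Z, X and Z are independent given Y, but conditioning on a noisy measurement of Y leaves a clear residual dependence, which is what breaks conventional independence tests.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)             # Y depends on X
z = 0.8 * y + rng.normal(size=n)             # Z depends only on Y
y_noisy = y + rng.normal(scale=1.0, size=n)  # Y measured with error

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing out c."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(ra, rb)[0, 1]

print(partial_corr(x, z, y))        # ~0: X independent of Z given the true Y
print(partial_corr(x, z, y_noisy))  # clearly nonzero: independence is masked
```

An independence test applied to the observed (noisy) variables would therefore reject X ⊥ Z | Y even though it holds for the actual variables, which is the failure mode the talk addresses.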
You are all cordially invited to the AMLab seminar on Tuesday March 21 at 16:00 in C3.163, where Frederick Eberhardt (Caltech) will give a talk titled “Causal Macro Variables”. Afterwards there are the usual drinks and snacks!
Abstract: Standard methods of causal discovery take as input a statistical data set of measurements of well-defined causal variables. The goal is then to determine the causal relations among these variables. But how are these causal variables identified or constructed in the first place? Often we have sensor level data but assume that the relevant causal interactions occur at a higher scale of aggregation. Sometimes we only have aggregate measurements of causal interactions at a finer scale. I will present recent work on a framework and method for the construction and identification of causal macro-variables that ensures that the resulting causal variables have well-defined intervention distributions. We have applied this approach to large-scale climate data, for which we were able to identify the macro-phenomenon of El Niño using an unsupervised method on micro-level sea surface temperature and wind measurements over the equatorial Pacific.
You are all cordially invited to the AMLab seminar on Tuesday March 14 at 16:00 in C3.163, where Taco Cohen will give a talk titled “Group Equivariant & Steerable CNNs”. Afterwards there are the usual drinks and snacks!
Abstract: Deep learning can be very effective, but typically requires large amounts of labelled data, which can be costly to collect. This is not only a major practical limitation to the applicability of deep learning, but also a fundamental barrier to AI: rapid learning is an essential part of intelligence.
In this talk I will present group equivariant networks, a natural generalization of convolutional networks that achieves improved statistical efficiency by exploiting symmetries like rotation and reflection. Instead of using convolutions, these networks use group equivariant convolutions. Group equivariant convolutions are easy to use, fast, and can be converted to standard convolutions after training. We show that simply replacing translational convolutions with group equivariant convolutions can improve image classification results. In the second part of the talk I will show how group equivariant nets can be scaled up to very large symmetry groups using steerable filters.
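As a primer on the construction, here is a minimal sketch (my own, with a toy numpy correlation rather than a real CNN layer) of the p4 "lifting" convolution: the image is correlated with the filter at rotations of 0/90/180/270 degrees, giving one output channel per rotation, and rotating the input rotates each output plane while cyclically shifting the rotation channels.

```python
import numpy as np

def corr2d(img, filt):
    """Plain 'valid' 2D cross-correlation in numpy."""
    H, W = img.shape
    h, w = filt.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * filt)
    return out

def p4_lifting_corr(img, filt):
    """Correlate with the filter at rotations 0/90/180/270:
    one output channel per rotation (the 'lifting' layer of a p4 G-CNN)."""
    return np.stack([corr2d(img, np.rot90(filt, k)) for k in range(4)])

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))
filt = rng.normal(size=(3, 3))

out = p4_lifting_corr(img, filt)
out_rot = p4_lifting_corr(np.rot90(img), filt)

# Equivariance: rotating the input rotates each output plane and
# cyclically shifts the rotation channels.
shifted = np.stack([np.rot90(out[(k - 1) % 4]) for k in range(4)])
print(np.allclose(out_rot, shifted))  # True
```

This is the weight-sharing that yields the improved statistical efficiency mentioned in the abstract: one filter serves all four orientations.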
You are all cordially invited to the AMLab seminar on Tuesday March 7 at 16:00 in C3.163, where Karen Ullrich will give a talk titled “Soft Weight-Sharing for Neural Network Compression”. Afterwards there are the usual drinks and snacks!
Abstract: The success of deep learning in numerous application domains has created the desire to run and train neural networks on mobile devices. This, however, conflicts with their compute-, memory- and energy-intensive nature, leading to a growing interest in compression. Recent work by Han et al. (2015a) proposes a pipeline that involves retraining, pruning and quantization of neural network weights, obtaining state-of-the-art compression rates. In this paper, we show that competitive compression rates can be achieved by using a version of “soft weight-sharing” (Nowlan & Hinton, 1992). Our method achieves both quantization and pruning in one simple (re-)training procedure. This point of view also exposes the relation between compression and the minimum description length (MDL) principle.
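To give a feel for the idea, here is a minimal sketch (my own illustration; the specific mixture parameters are invented, and this is not the paper's code): a mixture-of-Gaussians prior over the weights adds a penalty that pulls weights toward a small set of learned cluster means (quantization), with a heavy component at zero (pruning).

```python
import numpy as np

def mixture_penalty(w, pi, mu, sigma):
    """Negative log-likelihood of weights under a Gaussian mixture prior."""
    w = w[:, None]                                   # (n_weights, 1)
    log_comp = (np.log(pi) - np.log(sigma)
                - 0.5 * np.log(2 * np.pi)
                - 0.5 * ((w - mu) / sigma) ** 2)     # (n_weights, n_comp)
    # log-sum-exp over components, summed over weights
    m = log_comp.max(axis=1, keepdims=True)
    return -np.sum(m.squeeze(1) + np.log(np.exp(log_comp - m).sum(axis=1)))

pi = np.array([0.9, 0.05, 0.05])     # heavy zero component -> pruning
mu = np.array([0.0, -0.3, 0.3])      # cluster means -> quantization levels
sigma = np.array([0.01, 0.01, 0.01])

w_clustered = np.array([0.0, 0.0, 0.3, -0.3])  # near the means: low penalty
w_spread = np.array([0.1, -0.15, 0.22, 0.5])   # off-cluster: high penalty
print(mixture_penalty(w_clustered, pi, mu, sigma) <
      mixture_penalty(w_spread, pi, mu, sigma))   # True
```

Adding this penalty to the training loss (and learning the mixture parameters alongside the weights) is what lets a single retraining run produce both pruned and quantized networks.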
You are all cordially invited to the AMLab seminar on Tuesday February 28 at 16:00 in C3.163, where ChangYong Oh will give a talk titled “High dimensional Bayesian Optimization”. Afterwards there are the usual drinks and snacks!
Abstract: Bayesian optimization has been successful in many hyper-parameter optimization problems and reinforcement learning problems. Still, many obstacles prevent it from being applied more widely; among them, we focus on methods for high-dimensional spaces. To resolve the difficulties of high-dimensional Bayesian optimization, we devised a principled method to reduce the predictive variance of the Gaussian process, together with other assistive methods for its successful application.
First, I will give a brief explanation of general Bayesian optimization. Second, I will explain the sources that make high-dimensional problems harder, namely the ‘boundary effect’ and the ‘hollow ball problem’. Third, I will propose solutions to these problems, the so-called ‘variance reduction’ and ‘adaptive search region’.
You are all cordially invited to the AMLab seminar on Tuesday February 21 at 16:00 in C3.163, where Artem Grotov will give a talk titled “Deep Counterfactual Learning”. Afterwards there are the usual drinks and snacks!
Abstract: Deep learning is increasingly important for training interactive systems such as search engines and recommenders, which are applied to a broad range of tasks, including ranking, text similarity, and classification. Training a neural network to perform classification requires a lot of labeled data. While collecting large supervised labeled data sets is expensive and sometimes impossible, for example for personalized tasks, there often is an abundance of logged data collected from user interactions with an existing system. This type of data is called logged bandit feedback, and utilizing it is challenging because such data is noisy, biased and incomplete. We propose a learning method, Constrained Counterfactual Risk Minimisation (CCRM), based on counterfactual risk minimization with an empirical Bernstein bound, to tackle this problem and learn from logged bandit feedback. We evaluate CCRM on an image classification task. We find that CCRM performs well in practice and outperforms existing methods.
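For context, here is a minimal sketch of the inverse-propensity estimator that counterfactual risk minimization builds on (my own context-free toy example, not the CCRM implementation): logged losses are reweighted by how likely the evaluated policy is to repeat the logged action.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_actions = 10_000, 4

# Logged bandit feedback (context-free for brevity): a uniform logging
# policy picks actions and observes a loss only for the chosen action.
logging_probs = np.full(n_actions, 1 / n_actions)
actions = rng.integers(n_actions, size=n)
true_loss = np.array([0.9, 0.1, 0.5, 0.7])   # unknown in practice
losses = (rng.random(n) < true_loss[actions]).astype(float)

def ips_risk(target_probs):
    """Unbiased inverse-propensity estimate of a policy's expected loss,
    computed purely from the logged data."""
    w = target_probs[actions] / logging_probs[actions]
    return np.mean(losses * w)

good = np.array([0.05, 0.85, 0.05, 0.05])   # favors the low-loss action
bad  = np.array([0.85, 0.05, 0.05, 0.05])   # favors the high-loss action
print(ips_risk(good) < ips_risk(bad))  # True
```

Minimizing such an estimate over policies is counterfactual risk minimization; the variance of the importance weights is why a Bernstein-style confidence bound, as in the talk, is needed on top of the plain estimator.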
You are all cordially invited to the AMLab seminar on Tuesday February 14 at 16:00 in C3.163, where Jakub Tomczak will give a talk titled “Improving Variational Auto-Encoders using volume-preserving flows: A preliminary study”. Afterwards there are the usual drinks and snacks!
Abstract: Variational auto-encoders (VAE) are scalable and powerful generative models. However, the choice of the variational posterior determines the tractability and flexibility of the VAE. Commonly, latent variables are modeled using a normal distribution with a diagonal covariance matrix. This is computationally efficient but typically not flexible enough to match the true posterior distribution. One way of enriching the variational posterior is the application of normalizing flows, i.e., a series of invertible transformations applied to latent variables with a simple posterior. Applying general normalizing flows requires calculating a Jacobian-determinant, which can be computationally troublesome. However, it is possible to design a series of transformations for which the Jacobian-determinant equals 1, so-called volume-preserving flows. During the presentation I will describe my preliminary results on a new volume-preserving flow called the Householder flow, and an extension of the linear Inverse Autoregressive Flow.
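A small numerical check of the key property (my own sketch, not the paper's code): a Householder transformation z' = Hz with H = I − 2vvᵀ/‖v‖² is volume-preserving, so the Jacobian-determinant term in the flow objective comes for free.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
v = rng.normal(size=d)                          # Householder vector
H = np.eye(d) - 2.0 * np.outer(v, v) / (v @ v)  # reflection matrix

print(np.isclose(abs(np.linalg.det(H)), 1.0))  # True: volume-preserving
print(np.allclose(H @ H, np.eye(d)))           # True: its own inverse
```

Stacking several such reflections (with learned vectors v) enriches the posterior beyond a diagonal Gaussian while keeping the log-determinant term identically zero.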
You are all cordially invited to the AMLab seminar on Tuesday February 7 at 16:00 in C3.163, where Thijs van Ommen will give a talk titled “Recognizing linear structural equation models from observational data”. Afterwards there are the usual drinks and snacks!
Abstract: In a linear structural equation model, each variable is a linear function of other variables plus noise, and some noise terms may be correlated. Such a model can be represented by a mixed graph, with directed edges for causal relations and bidirected edges for correlated noise terms. Our goal is to learn the graph structure from observational data. To do this, we need to consider what constraints a model imposes on the observed covariance matrix. Some of these constraints do not correspond to (conditional) independences, and are not well understood. In particular, it is not even clear how to tell, by looking at two graphs, whether they impose exactly the same constraints. I will describe my progress in mapping out these models and their constraints.
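As a concrete illustration of the setup (my own sketch, using a simple chain as an example): for a linear SEM x = Bx + e the observed covariance is Σ = (I − B)⁻¹ Ω (I − B)⁻ᵀ, where B holds the directed-edge coefficients and off-diagonal entries of the noise covariance Ω correspond to bidirected edges; the graph structure then shows up as polynomial constraints on Σ.

```python
import numpy as np

B = np.array([[0.0, 0.0, 0.0],      # chain: x0 -> x1 -> x2
              [0.7, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
Omega = np.eye(3)                   # independent noises: no bidirected edges

A = np.linalg.inv(np.eye(3) - B)
Sigma = A @ Omega @ A.T             # model-implied covariance matrix

# The chain imposes the conditional-independence constraint x0 _||_ x2 | x1,
# which for Gaussians is the polynomial equation
# cov(x0,x2) * var(x1) = cov(x0,x1) * cov(x1,x2):
lhs = Sigma[0, 2] * Sigma[1, 1]
rhs = Sigma[0, 1] * Sigma[1, 2]
print(np.isclose(lhs, rhs))  # True
```

The hard cases discussed in the talk are constraints of this polynomial kind that do not correspond to any conditional independence, which makes deciding constraint-equivalence of two mixed graphs difficult.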
You are all cordially invited to the AMLab seminar on Tuesday January 31 at 16:00 in C3.163, where Max Welling will give a talk titled “AMLAB/QUVA’s progress in Deep Learning”. Afterwards there are the usual drinks and snacks!
Abstract: I will briefly describe the progress that has been made in the past year in AMLAB and QUVA in terms of deep learning. I will try to convey a coherent story of how some of these projects tie together into a bigger vision for the field. I will end with research questions that seem interesting for future projects.
You are all cordially invited to the AMLab seminar on Tuesday January 24 at 16:00 in C3.163, where Marco Loog will give a talk titled “Semi-Supervision, Surrogate Losses, and Safety Guarantees”. Afterwards there are the usual drinks and snacks!
Abstract: Users of classification tools tend to forget [or worse, might not even realize] that classifiers typically do not minimize the 0-1 loss, but a surrogate that upper-bounds the classification error on the training set. Here we argue that we should also study these losses as such, and we consider the problem of semi-supervised learning from this angle. In particular, we look at the basic setting of linear classifiers and convex margin-based losses, e.g. the hinge, logistic, and squared losses. We investigate to what extent semi-supervision can be safe at least on the training set, i.e., we want to construct semi-supervised classifiers for which the empirical risk is never larger than the risk achieved by their supervised counterparts. [Based on work carried out together with Jesse Krijthe; see https://arxiv.org/abs/1612.08875 and https://arxiv.org/abs/1503.00269].
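A small sketch of the upper-bounding point (my own illustration): writing the margin as m = y·f(x), each of the convex surrogates below dominates the 0-1 loss 1[m ≤ 0] pointwise, so minimizing the surrogate risk controls the training error.

```python
import numpy as np

def zero_one(m):  return (m <= 0).astype(float)
def hinge(m):     return np.maximum(0.0, 1.0 - m)
def logistic(m):  return np.log2(1.0 + np.exp(-m))  # base 2, so it equals 1 at m = 0
def squared(m):   return (1.0 - m) ** 2

m = np.linspace(-3, 3, 601)   # grid of margin values
print(np.all(hinge(m) >= zero_one(m)))     # True
print(np.all(logistic(m) >= zero_one(m)))  # True
print(np.all(squared(m) >= zero_one(m)))   # True
```

It is the behavior of these surrogates as losses in their own right, rather than as mere proxies for the 0-1 loss, that the talk's safety guarantees are stated in terms of.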