AMLab | Amsterdam Machine Learning Lab

Babak Esmaeili

PhD candidate (advised by J.W. van de Meent)
AMLab
Informatics Institute
University of Amsterdam
Science Park, Lab 42, L4.22

I am a PhD student at the Amsterdam Machine Learning Lab (AMLab), supervised by Jan-Willem van de Meent. Before September 2021, I was a PhD student at the Khoury College of Computer Sciences at Northeastern University.

I am interested in deep generative models and how we can guide them towards learning representations that are useful for downstream tasks.

Selected Publications

NeurIPS

Nested Variational Inference

Zimmermann, Heiko, Wu, Hao, Esmaeili, Babak, and van de Meent, Jan-Willem

In 35th Conference on Neural Information Processing Systems (NeurIPS) Dec 2021

We develop nested variational inference (NVI), a family of methods that learn proposals for nested importance samplers by minimizing a forward or reverse KL divergence at each level of nesting. NVI is applicable to many commonly-used importance sampling strategies and provides a mechanism for learning intermediate densities, which can serve as heuristics to guide the sampler. Our experiments apply NVI to (a) sample from a multimodal distribution using a learned annealing path, (b) learn heuristics that approximate the likelihood of future observations in a hidden Markov model, and (c) perform amortized inference in hierarchical deep generative models. We observe that optimizing nested objectives leads to improved sample quality in terms of log average weight and effective sample size.

ICML

Conjugate Energy-Based Models

Wu*, Hao, Esmaeili*, Babak, Wick, Michael, Tristan, Jean-Baptiste, and van de Meent, Jan-Willem

In Proceedings of the 38th International Conference on Machine Learning 18–24 Jul 2021

In this paper, we propose conjugate energy-based models (CEBMs), a new class of energy-based models that define a joint density over data and latent variables. The joint density of a CEBM decomposes into an intractable distribution over data and a tractable posterior over latent variables. CEBMs have similar use cases to variational autoencoders, in the sense that they learn an unsupervised mapping from data to latent variables. However, these models omit a generator network, which allows them to learn more flexible notions of similarity between data points. Our experiments demonstrate that conjugate EBMs achieve competitive results in terms of image modelling, predictive power of latent space, and out-of-domain detection on a variety of datasets.

AISTATS

Rate-Regularization and Generalization in Variational Autoencoders

Bozkurt*, Alican, Esmaeili*, Babak, Tristan, Jean-Baptiste, Brooks, Dana, Dy, Jennifer, and van de Meent, Jan-Willem

In Proceedings of The 24th International Conference on Artificial Intelligence and Statistics Mar 2021

Variational autoencoders (VAEs) optimize an objective that comprises a reconstruction loss (the distortion) and a KL term (the rate). The rate is an upper bound on the mutual information, which is often interpreted as a regularizer that controls the degree of compression. Here we examine whether inclusion of the rate term also improves generalization. We perform rate-distortion analyses in which we control the strength of the rate term, the network capacity, and the difficulty of the generalization problem. Lowering the strength of the rate term paradoxically improves generalization in most settings, and reducing the mutual information typically leads to underfitting. Moreover, we show that generalization performance continues to improve even after the mutual information saturates, indicating that the gap on the bound (i.e., the KL divergence relative to the inference marginal) affects generalization. This suggests that the standard spherical Gaussian prior is not an inductive bias that typically improves generalization, prompting further work to understand what choices of priors improve generalization in VAEs.

AISTATS

Structured Disentangled Representations

Esmaeili, Babak, Wu, Hao, Jain, Sarthak, Bozkurt, Alican, Siddharth, N., Paige, Brooks, Brooks, Dana H., Dy, Jennifer, and van de Meent, Jan-Willem

In The 22nd International Conference on Artificial Intelligence and Statistics Apr 2019

Deep latent-variable models learn representations of high-dimensional data in an unsupervised manner. A number of recent efforts have focused on learning representations that disentangle statistically independent axes of variation by introducing modifications to the standard objective function. These approaches generally assume a simple diagonal Gaussian prior and as a result are not able to reliably disentangle discrete factors of variation. We propose a two-level hierarchical objective to control the relative degree of statistical independence between blocks of variables and individual variables within blocks. We derive this objective as a generalization of the evidence lower bound, which allows us to explicitly represent the trade-offs between mutual information between data and representation, KL divergence between representation and prior, and coverage of the support of the empirical data distribution. Experiments on a variety of datasets demonstrate that our objective can not only disentangle discrete variables, but that doing so also improves disentanglement of other variables and, importantly, generalization even to unseen combinations of factors.

AISTATS

Structured Neural Topic Models for Reviews

Esmaeili, Babak, Huang, Hongyi, Wallace, Byron, and van de Meent, Jan-Willem

In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics Apr 2019

We present Variational Aspect-based Latent Topic Allocation (VALTA), a family of autoencoding topic models that learn aspect-based representations of reviews. VALTA defines a user-item encoder that maps bag-of-words vectors for combined reviews associated with each paired user and item onto structured embeddings, which in turn define per-aspect topic weights. We model individual reviews in a structured manner by inferring an aspect assignment for each sentence in a given review, where the per-aspect topic weights obtained by the user-item encoder serve to define a mixture over topics, conditioned on the aspect. The result is an autoencoding neural topic model for reviews, which can be trained in a fully unsupervised manner to learn topics that are structured into aspects. Experimental evaluation on a large number of datasets demonstrates that aspects are interpretable, yield higher coherence scores than non-structured autoencoding topic model variants, and can be utilized to perform aspect-based comparison and genre discovery.