Hi everyone, you are all cordially invited to the AMLab Seminar on Thursday 29th October at 2:00 p.m. CET on Zoom, where Andy Keller will give a talk titled “Self Normalizing Flows”. (Note that this talk starts two hours earlier than previous ones, so please update your calendar accordingly.)
Title: Self Normalizing Flows
Abstract: Efficient gradient computation of the Jacobian determinant term is a core problem of the normalizing flow framework. Thus, most proposed flow models either restrict to a function class with easy evaluation of the Jacobian determinant, or an efficient estimator thereof. However, these restrictions limit the performance of such density models, frequently requiring significant depth to reach desired performance levels. In this work, we propose Self Normalizing Flows, a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer. This reduces the computational complexity of each layer’s exact update from O(D^3) to O(D^2), allowing for the training of flow architectures which were otherwise computationally infeasible, while also providing efficient sampling. We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts, while surpassing the performance of their functionally constrained counterparts.
To gain deeper insights into this recently developed normalizing flow, feel free to join and discuss it 🙂!
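As a rough illustration of why the Jacobian determinant term in the abstract above is costly: for a linear layer z = Wx, the gradient of log|det W| with respect to W is the inverse transpose of W, and inverting a D×D matrix generally costs O(D^3). According to the abstract, Self Normalizing Flows replace such terms with a learned approximate inverse. The sketch below only verifies the underlying identity numerically on a 2×2 example. It is not the authors' implementation, and the helper names (`det2`, `inverse_transpose2`, `numerical_grad_logdet`) are made up for this illustration.

```python
import math


def det2(w):
    """Determinant of a 2x2 matrix given as nested lists."""
    return w[0][0] * w[1][1] - w[0][1] * w[1][0]


def inverse_transpose2(w):
    """Closed-form inverse transpose W^{-T} of a 2x2 matrix."""
    d = det2(w)
    return [[w[1][1] / d, -w[1][0] / d],
            [-w[0][1] / d, w[0][0] / d]]


def numerical_grad_logdet(w, eps=1e-6):
    """Central finite-difference gradient of log|det W| w.r.t. each entry."""
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            plus = [row[:] for row in w]
            minus = [row[:] for row in w]
            plus[i][j] += eps
            minus[i][j] -= eps
            diff = math.log(abs(det2(plus))) - math.log(abs(det2(minus)))
            grad[i][j] = diff / (2 * eps)
    return grad


# Arbitrary invertible example matrix (an assumption for illustration).
W = [[2.0, 0.5], [-1.0, 1.5]]
exact = inverse_transpose2(W)
approx = numerical_grad_logdet(W)
max_error = max(abs(exact[i][j] - approx[i][j])
                for i in range(2) for j in range(2))
print("max |W^{-T} - finite-difference gradient| =", max_error)
```

In a self normalizing layer, the exact `inverse_transpose2` term would be swapped for the transpose of a learned matrix that approximates the inverse, which avoids the inversion during training.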
Hey, guys~ You are all cordially invited to the AMLab Seminar on Thursday 15th October at 16:00 CEST on Zoom, where Wouter Kool will give a talk titled “Gumbel Mathemagic”.
Title: Gumbel Mathemagic
Abstract: Those who have seen the talk “Stochastic Beams and Where to Find Them” (https://www.facebook.com/icml.imls/videos/895968107420746/) can tune in 20 mins late as I will explain to you the mathemagic behind Stochastic Beam Search, an extension of the Gumbel-Max trick that enables sampling sequences without replacement. After that I will discuss Ancestral-Gumbel-Top-k Sampling, which is a generalization of Stochastic Beam Search. Finally, I will derive a multi-sample REINFORCE estimator with built-in baseline, based on sampling without replacement. All made possible by the humble Gumbel! 🙂 Bring your own snacks!
To gain deeper insights into Gumbel tricks and how to stabilize gradient estimates, feel free to join and discuss it!
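For readers who have not met the Gumbel-Max trick mentioned in the abstract above, here is a minimal sketch of its top-k variant for a single categorical distribution: perturb each log-probability with independent Gumbel noise and keep the k largest. It covers only the flat categorical case, not the sequence setting of Stochastic Beam Search. The function names and the example probabilities are assumptions made for this illustration.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) variate by inverse transform sampling."""
    u = rng.random()
    # rng.random() returns values in [0, 1); redraw on exactly 0 to avoid log(0).
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_top_k(log_probs, k, rng):
    """Return k distinct indices sampled without replacement.

    Each log-probability is perturbed with independent Gumbel noise, and the
    indices of the k largest perturbed values are returned in order.
    With k=1 this reduces to the Gumbel-Max trick.
    """
    perturbed = [lp + sample_gumbel(rng) for lp in log_probs]
    order = sorted(range(len(log_probs)), key=lambda i: perturbed[i], reverse=True)
    return order[:k]


rng = random.Random(0)
probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical example distribution
log_probs = [math.log(p) for p in probs]

# One sample of two distinct categories.
sample = gumbel_top_k(log_probs, k=2, rng=rng)

# Empirical check of the k=1 case: frequencies should approach probs.
n_draws = 20000
counts = [0] * len(probs)
for _ in range(n_draws):
    counts[gumbel_top_k(log_probs, 1, rng)[0]] += 1
frequencies = [c / n_draws for c in counts]
print(sample, frequencies)
```

The same perturb-and-sort idea is what makes sampling without replacement cheap: one pass of noise and one sort, rather than k rounds of renormalizing and resampling.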
Hi everyone, you are all cordially invited to the AMLab Seminar on Thursday 8th October at 16:00 CEST on Zoom, where Eric Nalisnick will give a talk titled “Specifying Priors on Predictive Complexity”.
Title: Specifying Priors on Predictive Complexity
Abstract: Specifying a Bayesian prior is notoriously difficult for complex models such as neural networks. Reasoning about parameters is made challenging by the high-dimensionality and over-parameterization of the space. Priors that seem benign and uninformative can have unintuitive and detrimental effects on a model’s predictions. For this reason, we propose predictive complexity priors: a functional prior that is defined by comparing the model’s predictions to those of a reference function. Although originally defined on the model outputs, we transfer the prior to the model parameters via a change of variables. The traditional Bayesian workflow can then proceed as usual. We apply our predictive complexity prior to modern machine learning tasks such as reasoning over neural network depth and sharing of statistical strength for few-shot learning.
Link to paper: https://arxiv.org/abs/2006.10801
To gain deeper insights into priors in Bayesian models, feel free to join and discuss it!
Hi everyone, we have a guest speaker Nutan Chen from ARGMAX.AI and you are all cordially invited to the AMLab Seminar on Thursday 1st October at 16:00 CEST on Zoom, where Nutan will give a talk titled “Distance in Latent Space”.
Title: Distance in Latent Space
Abstract: Measuring the similarity between data points often requires domain knowledge. This can in part be compensated for by relying on unsupervised methods such as latent-variable models, where the similarity/distance is estimated in a more compact latent space. However, deep generative models such as vanilla VAEs are not distance-preserving. Therefore, this type of model is unreliable for tasks such as precise distance measurement or smooth interpolation directly from the latent space. To solve this problem, we proposed novel methods based on VAEs to constrain or measure the distance in the latent space.
In the first section of this talk, I will explore a method that embeds dynamic movement primitives into the latent space of a time-dependent VAE framework (deep variational Bayes filters). Experimental results show that our framework generalizes well, e.g., when switching between movements or changing goals. Additionally, the distance between two data points that are close in time is constrained, which in turn influences the structure of the hidden space. In the second section, I will show how we transferred ideas from Riemannian geometry to deep generative models, letting the distance between two points be the shortest path on a Riemannian manifold induced by the transformation. The method yields a principled distance measure, provides a tool for visual inspection of deep generative models, and an alternative to linear interpolation in latent space. In the third section, I will propose an extension to the framework of VAEs that allows learning flat latent manifolds, where the Euclidean metric is a proxy for the similarity between data points. This is achieved by defining the latent space as a Riemannian manifold and by regularizing the metric tensor to be a scaled identity matrix. This results in a computationally efficient distance metric which is practical for applications in real-time scenarios.
Paper link: Learning flat manifold of VAEs. In International Conference on Machine Learning (ICML), 2020.
To gain deeper insights into the connections between VAEs and manifolds and see how these are applied to robotics, feel free to join and discuss it!
Hi, guys~ We have a guest speaker Abubakar Abid and you are all cordially invited to the AMLab Seminar on Thursday 17th September at 16:00 CEST on Zoom, where Abubakar will give a talk titled “Interactive UIs for Your Machine Learning Models”.
Title: Interactive UIs for Your Machine Learning Models
Abstract: Accessibility is a major challenge of machine learning (ML). Typical ML models are built by specialists and require specialized hardware/software as well as ML experience to validate. This makes it challenging for non-technical collaborators and endpoint users (e.g. physicians) to easily provide feedback on model development and to gain trust in ML. The accessibility challenge also makes collaboration more difficult and limits the ML researcher’s exposure to realistic data and scenarios that occur in the wild. To improve accessibility and facilitate collaboration, we developed an open-source Python package, Gradio, which allows researchers to rapidly generate a visual interface for their ML models. Gradio makes accessing any ML model as easy as opening a URL in your browser. Our development of Gradio is informed by interviews with a number of machine learning researchers who participate in interdisciplinary collaborations. We developed these features and carried out a case study to understand Gradio’s usefulness and usability in the setting of a machine learning collaboration between a researcher and a cardiologist.
To gain deeper insights into understanding your machine learning models, feel free to join and discuss it! See you there!
You are all cordially invited to the AMLab Seminar on Thursday 10th September at 16:00 CEST on Zoom, where Elise van der Pol will give a talk titled “MDP Homomorphic Networks for Deep Reinforcement Learning”.
Paper link: https://arxiv.org/pdf/2006.16908.pdf
Title: MDP Homomorphic Networks for Deep Reinforcement Learning
Abstract: This talk discusses MDP homomorphic networks for deep reinforcement learning. MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP. Current approaches to deep reinforcement learning do not usually exploit knowledge about such structure. By building this prior knowledge into policy and value networks using an equivariance constraint, we can reduce the size of the solution space. We specifically focus on group-structured symmetries (invertible transformations). Additionally, we introduce an easy method for constructing equivariant network layers numerically, so the system designer need not solve the constraints by hand, as is typically done.
We construct MDP homomorphic MLPs and CNNs that are equivariant under either a group of reflections or rotations. We show that such networks converge faster than unstructured baselines on CartPole, a grid world and Pong.
To gain deeper insights into deep reinforcement learning, feel free to join and discuss it!
You are all cordially invited to the AMLab Seminar on Thursday 3rd September at 16:00 CEST on Zoom, where Didrik Nielsen will give a talk titled “SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows”.
Paper link: https://arxiv.org/abs/2007.02731
Title: SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows
Abstract: Normalizing flows and variational autoencoders are powerful generative models that can represent complicated density functions. However, they both impose constraints on the models: Normalizing flows use bijective transformations to model densities whereas VAEs learn stochastic transformations that are non-invertible and thus typically do not provide tractable estimates of the marginal likelihood. In this paper, we introduce SurVAE Flows: A modular framework of composable transformations that encompasses VAEs and normalizing flows. SurVAE Flows bridge the gap between normalizing flows and VAEs with surjective transformations, wherein the transformations are deterministic in one direction — thereby allowing exact likelihood computation, and stochastic in the reverse direction — hence providing a lower bound on the corresponding likelihood. We show that several recently proposed methods, including dequantization and augmented normalizing flows, can be expressed as SurVAE Flows. Finally, we introduce common operations such as the max value, the absolute value, sorting and stochastic permutation as composable layers in SurVAE Flows.
Hi everyone, you are all cordially invited to the AMLab Seminar on Thursday 30th July at 16:00 CEST on Zoom, where Pim de Haan will give a talk titled “Natural Graph Networks”.
Paper link: https://arxiv.org/abs/2007.08349
Title: Natural Graph Networks
Abstract: Conventional neural message passing algorithms are invariant under permutation of the messages and hence forget how the information flows through the network. Studying the local symmetries of graphs, we propose a more general algorithm that uses different kernels on different edges, making the network equivariant to local and global graph isomorphisms and hence more expressive. Using elementary category theory, we formalize many distinct equivariant neural networks as natural networks, and show that their kernels are ‘just’ a natural transformation between two functors. We give one practical instantiation of a natural network on graphs which uses an equivariant message network parameterization, yielding good performance on several benchmarks.
You are all cordially invited to the AMLab seminar on Thursday 2nd July at 16:00 on Zoom, where Emiel Hoogeboom will give a talk titled “The Convolution Exponential”.
Title: The Convolution Exponential
Paper link: https://arxiv.org/abs/2006.01910
Abstract: We introduce a new method to build linear flows, by taking the exponential of a linear transformation. This linear transformation does not need to be invertible itself, and the exponential has the following desirable properties: it is guaranteed to be invertible, its inverse is straightforward to compute and the log Jacobian determinant is equal to the trace of the linear transformation. An important insight is that the exponential can be computed implicitly, which allows the use of convolutional layers. Using this insight, we develop new invertible transformations named convolution exponentials and graph convolution exponentials, which retain the equivariance of their underlying transformations. In addition, we generalize Sylvester Flows and propose Convolutional Sylvester Flows which are based on the generalization and the convolution exponential as basis change. Empirically, we show that the convolution exponential outperforms other linear transformations in generative flows on CIFAR10 and the graph convolution exponential improves the performance of graph normalizing flows. In addition, we show that Convolutional Sylvester Flows improve performance over residual flows as a generative flow model measured in log-likelihood.
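The three properties of the exponential listed in the abstract above can be checked on a tiny dense example. The sketch below uses a truncated power series on a singular 2×2 matrix and confirms that its exponential is invertible, that the inverse is the exponential of the negated matrix, and that the log determinant equals the trace. This is only a dense-matrix illustration with hypothetical helper names. The paper's contribution, per the abstract, is computing the exponential implicitly so that convolutional layers can be used, which this sketch does not attempt.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matrix_exp(a, terms=30):
    """Truncated power series exp(A) = sum_k A^k / k!."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term becomes A^k / k! by multiplying the previous term by A / k.
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# A singular matrix (determinant 0), chosen for illustration: the linear
# transformation itself does not need to be invertible.
A = [[0.5, 1.0], [1.0, 2.0]]
neg_A = [[-v for v in row] for row in A]

exp_A = matrix_exp(A)
exp_neg_A = matrix_exp(neg_A)

trace_A = A[0][0] + A[1][1]
log_det = math.log(det2(exp_A))
product = matmul(exp_A, exp_neg_A)  # should be close to the identity

print("log det exp(A) =", log_det, " trace(A) =", trace_A)
print("exp(A) exp(-A) =", product)
```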
You are all cordially invited to the AMLab seminar on Thursday 25th June at 16:00 on Zoom, where Victor Garcia Satorras will give a talk titled “Neural Enhanced Belief Propagation on Factor Graphs”.
Note: the video will be uploaded to YouTube afterwards, so you can also watch it later.
Paper link: https://arxiv.org/pdf/2003.01998.pdf
Abstract: A graphical model is a structured representation of locally dependent random variables. A traditional method to reason over these random variables is to perform inference using belief propagation. When provided with the true data generating process, belief propagation can infer the optimal posterior probability estimates in tree structured factor graphs. However, in many cases we may only have access to a poor approximation of the data generating process, or we may face loops in the factor graph, leading to suboptimal estimates. In this work we first extend graph neural networks to factor graphs (FG-GNN). We then propose a new hybrid model that runs an FG-GNN conjointly with belief propagation. The FG-GNN receives as input messages from belief propagation at every inference iteration and outputs a corrected version of them. As a result, we obtain a more accurate algorithm that combines the benefits of both belief propagation and graph neural networks. We apply our ideas to error correction decoding tasks, and we show that our algorithm can outperform belief propagation for LDPC codes on bursty channels.
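To make the claim about tree structured graphs concrete, here is a minimal sketch of sum-product belief propagation on a three-variable chain, which is the simplest tree. The marginals it produces match brute-force enumeration. The potentials and function names are invented for this example, and the sketch covers plain belief propagation only, not the FG-GNN hybrid described in the abstract.

```python
from itertools import product

# Hypothetical toy model: a chain x0 - x1 - x2 of binary variables.
unary = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]   # phi_i(x_i)
pairwise = [[1.0, 0.4], [0.4, 1.0]]            # psi(x_i, x_{i+1}), favors agreement


def normalize(v):
    total = sum(v)
    return [x / total for x in v]


def chain_bp_marginals(unary, pairwise):
    """Sum-product message passing on a chain; exact because a chain is a tree."""
    n = len(unary)
    states = range(2)
    # forward[i] is the message arriving at variable i from its left neighbor.
    forward = [[1.0, 1.0] for _ in range(n)]
    for i in range(1, n):
        forward[i] = normalize([
            sum(forward[i - 1][a] * unary[i - 1][a] * pairwise[a][b] for a in states)
            for b in states
        ])
    # backward[i] is the message arriving at variable i from its right neighbor.
    backward = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        backward[i] = normalize([
            sum(backward[i + 1][b] * unary[i + 1][b] * pairwise[a][b] for b in states)
            for a in states
        ])
    # The belief at each variable combines its local potential with both messages.
    return [normalize([forward[i][s] * unary[i][s] * backward[i][s] for s in states])
            for i in range(n)]


def brute_force_marginals(unary, pairwise):
    """Reference marginals obtained by enumerating every joint assignment."""
    n = len(unary)
    marginals = [[0.0, 0.0] for _ in range(n)]
    for assignment in product(range(2), repeat=n):
        weight = 1.0
        for i, s in enumerate(assignment):
            weight *= unary[i][s]
        for i in range(n - 1):
            weight *= pairwise[assignment[i]][assignment[i + 1]]
        for i, s in enumerate(assignment):
            marginals[i][s] += weight
    return [normalize(m) for m in marginals]


bp = chain_bp_marginals(unary, pairwise)
exact = brute_force_marginals(unary, pairwise)
max_gap = max(abs(bp[i][s] - exact[i][s]) for i in range(3) for s in range(2))
print(bp, max_gap)
```

On graphs with loops the same message updates no longer guarantee exact marginals, which is the regime where the abstract's learned correction is meant to help.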