Hi everyone, You are all cordially invited to the AMLab Seminar on Thursday 29th October at 2:00 p.m CET on Zoom, where Andy Keller will give a talk titled “Self Normalizing Flows “. (Note that the time slot for this talk is modified a bit, 2 hours advanced than previous ones, it will be appreciated if you can save this in your calendar.)
Title: Self Normalizing Flows
Abstract: Efficient gradient computation of the Jacobian determinant term is a core problem of the normalizing flow framework. Thus, most proposed flow models either restrict to a function class with easy evaluation of the Jacobian determinant, or an efficient estimator thereof. However, these restrictions limit the performance of such density models, frequently requiring significant depth to reach desired performance levels. In this work, we propose Self Normalizing Flows, a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer. This reduces the computational complexity of each layer’s exact update from O(D^3) to O(D^2), allowing for the training of flow architectures which were otherwise computationally infeasible, while also providing efficient sampling. We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts, while surpassing the performance of their functionally constrained counterparts.
To gain more deep insights into this recently developed normalizing flow, feel free to join and discuss it 🙂 !
Hey, guys~ You are all cordially invited to the AMLab Seminar on Thursday 15th October at 16:00 CEST on Zoom, where Wouter Kool will give a talk titled “Gumbel Mathemagic“.
Title: Gumbel Mathemagic
Abstract: Those who have seen the talk “Stochastic Beams and Where to Find Them” (https://www.facebook.com/icml.imls/videos/895968107420746/) can tune in 20 mins late as I will explain to you the mathemagic behind Stochastic Beam Search, an extension of the Gumbel-Max trick that enables sampling sequences without replacement. After that I will discuss Ancestral-Gumbel-Top-k Sampling, which is a generalization of Stochastic Beam Search. Finally, I will derive a multi-sample REINFORCE estimator with built-in baseline, based on sampling without replacement. All made possible by the humble Gumbel! 🙂 Bring your own snacks!
To gain more deep insights into Gumbel tricks and how to stabilize gradient estimates, feel free to join and discuss it!
Hi everyone, you are all cordially invited to the AMLab Seminar on Thursday 8th October at 16:00 CEST on Zoom, where Eric Nalisnick will give a talk titled ” Specifying Priors on Predictive Complexity “.
Title: Specifying Priors on Predictive Complexity
Abstract: Specifying a Bayesian prior is notoriously difficult for complex models such as neural networks. Reasoning about parameters is made challenging by the high-dimensionality and over-parameterization of the space. Priors that seem benign and uninformative can have unintuitive and detrimental effects on a model’s predictions. For this reason, we propose predictive complexity priors: a functional prior that is defined by comparing the model’s predictions to those of a reference function. Although originally defined on the model outputs, we transfer the prior to the model parameters via a change of variables. The traditional Bayesian workflow can then proceed as usual. We apply our predictive complexity prior to modern machine learning tasks such as reasoning over neural network depth and sharing of statistical strength for few-shot learning.
Link to paper : https://arxiv.org/abs/2006.10801 To gain more deep insights into priors in Bayesian models, feel free to join and discuss it!
Hi everyone, we have a guest speaker Nutan Chen from ARGMAX.AI and you are all cordially invited to the AMLab Seminar on Thursday 1st October at 16:00 CEST on Zoom, where Nutan will give a talk titled ” Distance in Latent Space “.
Title : Distance in Latent Space
Abstract : Measuring the similarity between data points often requires domain knowledge. It can in parts be compensated by relying on unsupervised methods such as latent-variable models, where the similarity/distance is estimated in a more compact latent space. However, deep generative models such as vanilla VAEs are not distance-preserving. Therefore, this type of model is unreliable for tasks such as precise distance measurement or smooth interpolation directly from the latent space. To solve this problem, we proposed novel methods based VAEs to constrain or measure the distance in the latent space.
In the first section of this talk, I will explore a method that embeds dynamic movement primitives into the latent space of a time-dependent VAE framework (deep variational Bayes filters). Experimental results show that our framework generalizes well, e.g., switches between movements or changing goals. Additionally, the distance between two data points that are close in time is constrained, which results in influencing the data structure of the hidden space. In the second section, I will show how we transferred ideas from Riemannian geometry to deep generative models, letting the distance between two points be the shortest path on a Riemannian manifold induced by the transformation. The method yields a principled distance measure, provides a tool for visual inspection of deep generative models, and an alternative to linear interpolation in latent space. In the third section, I will propose an extension to the framework of VAEs that allows learning flat latent manifolds, where the Euclidean metric is a proxy for the similarity between data points. This is achieved by defining the latent space as a Riemannian manifold and by regularizing the metric tensor to be a scaled identity matrix. This results in a computational efficient distance metric which is practical for applications in real-time scenarios.
Paper Link : Learning flat manifold of VAEs. In International Conference on Machine Learning (ICML). 2020.
To gain more deep insights into connections
between VAEs and manifolds and see how these are applied to robotics, feel free
to join and discuss it!
Hi, guys~ We have a guest speaker Abubakar Abid and you are all cordially invited to the AMLab Seminar on Thursday 17th September at 16:00 CEST on Zoom, where Abubakar will give a talk titled ” Interactive UIs for Your Machine Learning Models “.
Title: Interactive UIs for Your Machine Learning
Abstract: Accessibility is a major challenge of machine
learning (ML). Typical ML models are built by specialists and require
specialized hardware/software as well as ML experience to validate. This makes
it challenging for non-technical collaborators and endpoint users (e.g.
physicians) to easily provide feedback on model development and to gain trust
in ML. The accessibility challenge also makes collaboration more difficult and
limits the ML researcher’s exposure to realistic data and scenarios that occur
in the wild. To improve accessibility and facilitate collaboration, we
developed an open-source Python package, Gradio, which allows researchers to
rapidly generate a visual interface for their ML models. Gradio makes accessing
any ML model as easy as opening a URL in your browser. Our development of
Gradio is informed by interviews with a number of machine learning researchers
who participate in interdisciplinary collaborations. We developed these
features and carried out a case study to understand Gradio’s usefulness and
usability in the setting of a machine learning collaboration between a
researcher and a cardiologist.
To gain more deep insights into understanding your machine learning models, feel free to join and discuss it! See you there!
You are all cordially invited to the AMLab Seminar on Thursday 10th September at 16:00 CEST on Zoom, where Elise van der Pol will give a talk titled “MDP Homomorphic Networks for Deep Reinforcement Learning “.
Paper link: https://arxiv.org/pdf/2006.16908.pdf
Title: MDP Homomorphic Networks for Deep Reinforcement Learning
Abstract: This talk discusses MDP homomorphic networks for deep reinforcement learning. MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP. Current approaches to deep reinforcement learning do not usually exploit knowledge about such structure. By building this prior knowledge into policy and value networks using an equivariance constraint, we can reduce the size of the solution space. We specifically focus on group-structured symmetries (invertible transformations). Additionally, we introduce an easy method for constructing equivariant network layers numerically, so the system designer need not solve the constraints by hand, as is typically done.
We construct MDP
homomorphic MLPs and CNNs that are equivariant under either a group of
reflections or rotations. We show that such networks converge faster than
unstructured baselines on CartPole, a grid world and Pong.
To gain more deep insights
on Deep Reinforcement Learning, feel free to join it and discuss!
You are all cordially invited to the AMLab Seminar on Thursday 3rd September at 16:00 CEST on Zoom, where Didrik Nielsen will give a talk titled “SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows”.
Paper link: https://arxiv.org/abs/2007.02731
Title: SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows
Abstract: Normalizing flows and variational autoencoders are powerful generative models that can represent complicated density functions. However, they both impose constraints on the models: Normalizing flows use bijective transformations to model densities whereas VAEs learn stochastic transformations that are non-invertible and thus typically do not provide tractable estimates of the marginal likelihood. In this paper, we introduce SurVAE Flows: A modular framework of composable transformations that encompasses VAEs and normalizing flows. SurVAE Flows bridge the gap between normalizing flows and VAEs with surjective transformations, wherein the transformations are deterministic in one direction — thereby allowing exact likelihood computation, and stochastic in the reverse direction — hence providing a lower bound on the corresponding likelihood. We show that several recently proposed methods, including dequantization and augmented normalizing flows, can be expressed as SurVAE Flows. Finally, we introduce common operations such as the max value, the absolute value, sorting and stochastic permutation as composable layers in SurVAE Flows.
Hi everyone, you are all cordially invited to the AMLab Seminar on Thursday 30th July at 16:00 CEST on Zoom, where Pim de Haan will give a talk titled “Natural Graph Networks“.
Paper link: https://arxiv.org/abs/2007.08349
Title: Natural Graph Networks
Abstract: Conventional neural message passing algorithms are invariant under permutation of the messages and hence forget how the information flows through the network. Studying the local symmetries of graphs, we propose a more general algorithm that uses different kernels on different edges, making the network equivariant to local and global graph isomorphisms and hence more expressive. Using elementary category theory, we formalize many distinct equivariant neural networks as natural networks, and show that their kernels are ‘just’ a natural transformation between two functors. We give one practical instantiation of a natural network on graphs which uses an equivariant message network parameterization, yielding good performance on several benchmarks.
You are all cordially invited to the UvA-Bosch Delta lab seminar on Thursday October 17th October at 15:00 on the Roeterseilandcampus A2.11 , where David Blei, well known for his fantastic work on LDA, Bayesian nonparametrics, and variational inference. He will give a talk on “The Blessings of Multiple Causes”.
Causal inference from observational data is a vital problem, but itcomes with strong assumptions. Most methods require that we observeall confounders, variables that affect both the causal variables andthe outcome variables. But whether we have observed all confounders isa famously untestable assumption. We describe the deconfounder, a wayto do causal inference with weaker assumptions than the classicalmethods require.
How does the deconfounder work? While traditional causal methodsmeasure the effect of a single cause on an outcome, many modernscientific studies involve multiple causes, different variables whoseeffects are simultaneously of interest. The deconfounder uses thecorrelation among multiple causes as evidence for unobservedconfounders, combining unsupervised machine learning and predictivemodel checking to perform causal inference. We demonstrate thedeconfounder on real-world data and simulation studies, and describethe theoretical requirements for the deconfounder to provide unbiasedcausal estimates.
This is joint work with Yixin Wang.
David Blei is a Professor of Statistics and Computer Science atColumbia University, and a member of the Columbia Data ScienceInstitute. He studies probabilistic machine learning, including itstheory, algorithms, and application. David has received several awardsfor his research, including a Sloan Fellowship (2010), Office of NavalResearch Young Investigator Award (2011), Presidential Early CareerAward for Scientists and Engineers (2011), Blavatnik Faculty Award(2013), ACM-Infosys Foundation Award (2013), a Guggenheim fellowship(2017), and a Simons Investigator Award (2019). He is theco-editor-in-chief of the Journal of Machine Learning Research. He isa fellow of the ACM and the IMS.
You are all cordially invited to the special AMLab seminar on Tuesday 15th October at 12:00 in C1.112, where Will Grathwohl, from David Duvenaud’s group in Toronto will give a talk titled “The many virtues of Incorporating energy-based generative models into discriminative learning”.
Will is one of the authors behind many great recent papers. To name a few:
Abstract: Generative models have long been promised to benefit downstream discriminative machine learning applications such as out-of-distribution detection, adversarial robustness, uncertainty quantification, semi-supervised learning and many others. Yet, except for a few notable exceptions, methods for these tasks based on generative models are considerably outperformed by hand-tailored methods for each specific task. In this talk, I will advocate for the incorporation of energy-based generative models into the standard discriminative learning framework. Energy-Based Models (EBMs) can be much more easily incorporated into discriminative models than alternative generative modeling approaches and can benefit from network architectures designed for discriminative performance. I will present a novel method for jointly training EBMs alongside classifiers and demonstrate that this approach allows us to build models which rival the performance of state-of-the-art generative models and discriminative models within a single model. Further, we demonstrate our joint model gains many desirable properties such as a built-in mechanism for out-of-distribution detection, improved calibration, and improved robustness to adversarial examples — rivaling or improving upon hand-designed methods for each task.