Talk by Sindy Löwe

You are all cordially invited to the AMLab seminar on Thursday 14th November at 14:00 in C3.163, where Sindy Löwe will give a talk titled “Putting An End to End-to-End: Gradient-Isolated Learning of Representations”. There are the usual drinks and snacks!

Abstract: We propose a novel deep learning method for local self-supervised representation learning that does not require labels nor end-to-end backpropagation but exploits the natural order in data instead. Inspired by the observation that biological neural networks appear to learn without backpropagating a global error signal, we split a deep neural network into a stack of gradient-isolated modules. Each module is trained to maximally preserve the information of its inputs using the InfoNCE bound from Oord et al. [2018]. Despite this greedy training, we demonstrate that each module improves upon the output of its predecessor, and that the representations created by the top module yield highly competitive results on downstream classification tasks in the audio and visual domain. The proposal enables optimizing modules asynchronously, allowing large-scale distributed training of very deep neural networks on unlabelled datasets.
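
As a rough illustration of the InfoNCE objective mentioned in the abstract, the sketch below computes the loss from a square score matrix with NumPy. The function name `info_nce_loss` and the toy score matrices are assumptions made for this example; in the actual method the scores would come from a learned model, which is not shown here.

```python
import numpy as np

def info_nce_loss(scores):
    """InfoNCE loss for a square score matrix (illustrative sketch).

    scores[i, j] is a hypothetical similarity between context i and
    candidate j. The positive pair for context i sits on the diagonal;
    all other entries in row i act as negative samples.
    """
    scores = np.asarray(scores, dtype=float)
    # Numerically stable log-softmax over each row.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Average negative log-probability of selecting the positive sample.
    return -np.mean(np.diag(log_softmax))

n = 8
# Scores that carry no information: every candidate looks the same.
loss_uninformative = info_nce_loss(np.zeros((n, n)))
# Scores that clearly single out the positive pair.
loss_informative = info_nce_loss(20.0 * np.eye(n))
```

With uninformative scores the loss equals log(n); as the scores separate positives from negatives it approaches zero. Following Oord et al. [2018], log(n) minus the loss is a lower bound on the mutual information between context and target.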

Talk by David Blei on The Blessings of Multiple Causes

You are all cordially invited to the UvA-Bosch Delta lab seminar on Thursday 17th October at 15:00 at the Roeterseilandcampus, room A2.11, where David Blei, well known for his fantastic work on LDA, Bayesian nonparametrics, and variational inference, will give a talk on “The Blessings of Multiple Causes”.


Abstract: Causal inference from observational data is a vital problem, but it comes with strong assumptions. Most methods require that we observe all confounders, variables that affect both the causal variables and the outcome variables. But whether we have observed all confounders is a famously untestable assumption. We describe the deconfounder, a way to do causal inference with weaker assumptions than the classical methods require.
How does the deconfounder work? While traditional causal methods measure the effect of a single cause on an outcome, many modern scientific studies involve multiple causes, different variables whose effects are simultaneously of interest. The deconfounder uses the correlation among multiple causes as evidence for unobserved confounders, combining unsupervised machine learning and predictive model checking to perform causal inference. We demonstrate the deconfounder on real-world data and simulation studies, and describe the theoretical requirements for the deconfounder to provide unbiased causal estimates.
This is joint work with Yixin Wang.


David Blei is a Professor of Statistics and Computer Science at Columbia University, and a member of the Columbia Data Science Institute. He studies probabilistic machine learning, including its theory, algorithms, and application. David has received several awards for his research, including a Sloan Fellowship (2010), Office of Naval Research Young Investigator Award (2011), Presidential Early Career Award for Scientists and Engineers (2011), Blavatnik Faculty Award (2013), ACM-Infosys Foundation Award (2013), a Guggenheim fellowship (2017), and a Simons Investigator Award (2019). He is the co-editor-in-chief of the Journal of Machine Learning Research. He is a fellow of the ACM and the IMS.

Talk by Will Grathwohl

You are all cordially invited to the special AMLab seminar on Tuesday 15th October at 12:00 in C1.112, where Will Grathwohl from David Duvenaud’s group in Toronto will give a talk titled “The many virtues of Incorporating energy-based generative models into discriminative learning”.

Will is one of the authors behind many great recent papers. To name a few: 

Abstract: Generative models have long been promised to benefit downstream discriminative machine learning applications such as out-of-distribution detection, adversarial robustness, uncertainty quantification, semi-supervised learning and many others.  Yet, except for a few notable exceptions, methods for these tasks based on generative models are considerably outperformed by hand-tailored methods for each specific task. In this talk, I will advocate for the incorporation of energy-based generative models into the standard discriminative learning framework. Energy-Based Models (EBMs) can be much more easily incorporated into discriminative models than alternative generative modeling approaches and can benefit from network architectures designed for discriminative performance. I will present a novel method for jointly training EBMs alongside classifiers and demonstrate that this approach allows us to build models which rival the performance of state-of-the-art generative models and discriminative models within a single model. Further, we demonstrate our joint model gains many desirable properties such as a built-in mechanism for out-of-distribution detection, improved calibration, and improved robustness to adversarial examples — rivaling or improving upon hand-designed methods for each task. 
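
The abstract does not spell out how the EBM and the classifier are combined. One possible reading, sketched below purely as an assumption, is to treat a classifier's logits f(x)[y] as unnormalized joint log-densities, so that p(y|x) is the usual softmax and the energy of an input is E(x) = -logsumexp_y f(x)[y]. The function names and the toy logits are hypothetical.

```python
import numpy as np

def log_sum_exp(logits):
    """Numerically stable log-sum-exp over the class axis (last axis)."""
    m = logits.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=-1, keepdims=True))).squeeze(-1)

def class_log_probs(logits):
    """log p(y | x): the standard softmax classifier."""
    return logits - log_sum_exp(logits)[..., None]

def energy(logits):
    """E(x) = -logsumexp_y f(x)[y], so that p(x) is proportional to exp(-E(x))."""
    return -log_sum_exp(logits)

# Toy logits for two inputs and three classes (made-up numbers).
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 0.3]])

# Adding a constant to all logits of an input leaves p(y | x) unchanged
# but shifts the energy, which is the extra degree of freedom an EBM can use.
shifted_logits = logits + 3.0
probs = np.exp(class_log_probs(logits))
shifted_probs = np.exp(class_log_probs(shifted_logits))
energy_gap = energy(shifted_logits) - energy(logits)
```

Under this assumption the same network output serves both purposes: the softmax over logits gives class predictions, while the log-sum-exp gives an unnormalized score for the input itself.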

Talk by Andy Keller

You are all cordially invited to the AMLab seminar on Thursday 10th October at 14:00 in D1.113, where Andy Keller will give a talk titled “Approaches to Learning Approximate Equivariance”. There are the usual drinks and snacks!

Abstract: In this talk we will discuss a few proposed approaches to learning approximate equivariance directly from data. These approaches range from weakly supervised to fully unsupervised, relying on either mutual information bounds or inductive biases respectively. Critical discussion will be encouraged as much of the work is in early phases. Preliminary results will be shown to demonstrate validity of concepts.

Talk by Bhaskar Rao

You are all cordially invited to the AMLab seminar on Thursday 3rd October at 14:00 in B0.201, where Bhaskar Rao (visiting researcher: bio below) will give a talk titled “Scale Mixture Modeling of Priors for Sparse Signal Recovery”. There are the usual drinks and snacks!

Abstract: This talk will discuss Bayesian approaches to solving the sparse signal recovery problem. In particular, methods based on priors that admit a scale mixture representation will be discussed with emphasis on Gaussian scale mixture modeling. In the context of MAP estimation, iterative reweighted approaches will be developed. The scale mixture modeling naturally leads to a hierarchical framework, and empirical Bayesian methods motivated by this hierarchy will be highlighted. The pros and cons of the two approaches, MAP versus Empirical Bayes, will be a subject of discussion.

Talk by Marco Federici

You are all cordially invited to the AMLab seminar on Thursday 12th September at 14:00 in C4.174, where Marco Federici will give a talk titled “Towards Robust Representations by Exploiting Multiple Data Views”. There are the usual drinks and snacks!

Abstract: The problem of creating data representations can be formulated as the definition of an encoding function which maps observations into a predefined code space. Whenever the encoding is used as an intermediate step for a predictive task, among the possible encodings, we are generally interested in the ones that retain the desired target information. Furthermore, recent literature has shown that discarding irrelevant factors of variation in the data (minimality) yields robustness and invariance to nuisances of the task. Following these two general guidelines, in this work, we introduce an information-theoretical method that exploits some known properties of the predictive task to create robust data representations without requiring direct supervision signals. By exploiting pairs of joint observations, our model learns representations that are as discriminative as the original data for the predictive task while being more robust than the raw signal. The proposed theory builds upon well-known self-supervised algorithms (such as Contrastive Predictive Coding and the InfoMax principle), bridging the gap between information bottleneck and probabilistic invariance. Empirical evidence shows the applicability of our model for both multi-view and single-view datasets.

Talk by Wouter van Amsterdam

You are all cordially invited to the AMLab seminar on Thursday September 5th at 16:00 in C3.163, where Wouter van Amsterdam will give a talk titled “Controlling for Biasing Signals in Images for Prognostic Models: Survival Predictions for Lung Cancer with Deep Learning”. Afterwards there are the usual drinks and snacks!

Abstract: Deep learning has shown remarkable results for image analysis and is expected to aid individual treatment decisions in health care. Treatment recommendations are predictions with an inherently causal interpretation. To use deep learning for these applications, deep learning methods must be promoted from the level of mere associations to causal questions. We present a scenario with real-world medical images (CT-scans of lung cancers) and simulated outcome data. Through the data simulation scheme, the images contain two distinct factors of variation that are associated with survival, but represent a collider (tumor size) and a prognostic factor (tumor heterogeneity) respectively. We show that when this collider can be quantified, unbiased individual prognosis predictions are attainable with deep learning. This is achieved by (1) setting a dual task for the network to predict both the outcome and the collider and (2) enforcing a form of independence of the activation distributions of the last layer. Our method provides an example of combining deep learning and structural causal models to achieve unbiased individual prognosis predictions. Extensions of machine learning methods for applications to causal questions are required to attain the long standing goal of personalized medicine supported by artificial intelligence.

Talk by Karen Ullrich

You are all cordially invited to the AMLab seminar on Thursday June 20th at 16:00 in C3.163, where Karen Ullrich will give a talk titled “Differentiable probabilistic models of scientific imaging with the Fourier slice theorem”. Afterwards there are the usual drinks and snacks!

Abstract: Scientific imaging techniques such as optical and electron microscopy and computed tomography (CT) scanning are used to study the 3D structure of an object through 2D observations. These observations are related to the original 3D object through orthogonal integral projections. For common 3D reconstruction algorithms, computational efficiency requires the modeling of the 3D structures to take place in Fourier space by applying the Fourier slice theorem. At present, it is unclear how to differentiate through the projection operator, and hence current learning algorithms cannot rely on gradient-based methods to optimize 3D structure models. In this paper we show how back-propagation through the projection operator in Fourier space can be achieved. We demonstrate the validity of the approach with experiments on 3D reconstruction of proteins. We further extend our approach to learning probabilistic models of 3D objects. This allows us to predict regions of low sampling rates or estimate noise. A higher sample efficiency can be reached by utilizing the learned uncertainties of the 3D structure as an unsupervised estimate of the model fit. Finally, we demonstrate how the reconstruction algorithm can be extended with an amortized inference scheme on unknown attributes such as object pose. Through empirical studies we show that joint inference of the 3D structure and the object pose becomes more difficult when the ground truth object contains more symmetries. Due to the presence of, for instance, (approximate) rotational symmetries, the pose estimation can easily get stuck in local optima, inhibiting a fine-grained high-quality estimate of the 3D structure.
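
For readers unfamiliar with the Fourier slice theorem, the following NumPy sketch checks its discrete, axis-aligned form on a random toy volume: the 2D Fourier transform of a projection equals the central slice of the 3D Fourier transform. The volume and variable names are made up for illustration; this is not the reconstruction pipeline from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 16, 16))  # toy 3D "object"

# Orthogonal integral projection along axis 0 yields a 2D observation.
projection = volume.sum(axis=0)

# The 2D Fourier transform of the projection ...
projection_ft = np.fft.fft2(projection)
# ... equals the slice of the 3D transform at zero frequency along axis 0.
central_slice = np.fft.fftn(volume)[0, :, :]

max_abs_diff = np.abs(projection_ft - central_slice).max()
```

The difference is zero up to floating-point error. Projections at arbitrary orientations correspond to rotated slices through the origin of Fourier space, which require interpolation and are not covered by this sketch.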

Talk by Wouter Kool

You are all cordially invited to the AMLab seminar on Thursday June 6th at 16:00 in C3.163, where Wouter Kool will give a talk titled “Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement”. Afterwards there are the usual drinks and snacks!

Abstract: The well-known Gumbel-Max trick for sampling from a categorical distribution can be extended to sample k elements without replacement. We show how to implicitly apply this ‘Gumbel-Top-k’ trick on a factorized distribution over sequences, allowing us to draw exact samples without replacement using a Stochastic Beam Search. Even for exponentially large domains, the number of model evaluations grows only linearly in k and the maximum sampled sequence length. The algorithm creates a theoretical connection between sampling and (deterministic) beam search and can be used as a principled intermediate alternative. In a translation task, the proposed method compares favourably against alternatives to obtain diverse yet good quality translations. We show that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy.
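
The basic Gumbel-Top-k trick for a plain categorical distribution can be sketched in a few lines of NumPy: perturb the log-probabilities with independent Gumbel noise and keep the k largest. The helper name `gumbel_top_k` and the toy distribution are assumptions for this example; the Stochastic Beam Search over sequences described in the abstract is not implemented here.

```python
import numpy as np

def gumbel_top_k(log_probs, k, rng):
    """Sample k distinct category indices without replacement (sketch)."""
    log_probs = np.asarray(log_probs, dtype=float)
    # Perturb each log-probability with independent standard Gumbel noise.
    perturbed = log_probs + rng.gumbel(size=log_probs.shape)
    # Indices of the k largest perturbed values, in decreasing order.
    return np.argsort(-perturbed)[:k]

rng = np.random.default_rng(0)
probs = np.array([0.5, 0.3, 0.15, 0.05])  # toy categorical distribution
sample = gumbel_top_k(np.log(probs), k=3, rng=rng)
```

For k = 1 this reduces to the Gumbel-Max trick, so the first returned index is distributed according to `probs`; the remaining indices follow the order in which categories would be drawn when sampling sequentially without replacement.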