You are all cordially invited to the AMLab seminar on **Monday Mar 18th at 15:00** (Note the non-standard date/time) in C3.163, where Benjamin Bloem-Reddy will give a talk titled “Probabilistic symmetry and invariant neural networks”. Afterwards there are the usual drinks and snacks!
Abstract: In an effort to improve the performance of deep neural networks in data-scarce, non-i.i.d., or unsupervised settings, much recent research has been devoted to encoding invariance under symmetry transformations into neural network architectures. We treat the neural network input and output as random variables, and consider group invariance from the perspective of probabilistic symmetry. Drawing on tools from probability and statistics, we establish a link between functional and probabilistic symmetry, and obtain functional representations of probability distributions that are invariant or equivariant under the action of a compact group. Those representations characterize the structure of neural networks that can be used to represent such distributions and yield a general program for constructing invariant stochastic or deterministic neural networks. We develop the details of the general program for exchangeable sequences and arrays, recovering a number of recent examples as special cases.
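As a rough illustration of what invariance and equivariance under permutations mean for exchangeable sequences, the sketch below checks both properties numerically. It is not the construction from the talk: the sum-pooling form, the helper names (`invariant_net`, `equivariant_layer`), the weights and the sizes are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_hidden = 5, 3, 8  # hypothetical sequence length and layer widths

# Hypothetical random weights; a real model would learn these.
w_phi = rng.normal(size=(d_in, d_hidden))
w_rho = rng.normal(size=(d_hidden,))


def invariant_net(x):
    """rho(sum_i phi(x_i)): sum pooling discards the order of the rows of x."""
    pooled = np.tanh(x @ w_phi).sum(axis=0)
    return float(pooled @ w_rho)


def equivariant_layer(x, a=0.7, b=-0.2):
    """y_i = a * x_i + b * sum_j x_j: permuting the inputs permutes the outputs."""
    return a * x + b * x.sum(axis=0, keepdims=True)


x = rng.normal(size=(n, d_in))
perm = rng.permutation(n)

# Invariance: the output does not change when the rows are reordered.
invariant_gap = abs(invariant_net(x) - invariant_net(x[perm]))

# Equivariance: permuting then applying the layer equals applying then permuting.
equivariant_gap = float(
    np.max(np.abs(equivariant_layer(x)[perm] - equivariant_layer(x[perm])))
)
```

Both gaps should be zero up to floating-point rounding, since the only interaction between elements goes through a sum, which ignores ordering.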
You are all cordially invited to the AMLab seminar on Thursday Mar 14th at 16:00 in C3.163, where Changyong Oh will give a talk titled “Combinatorial Bayesian Optimization using Graph Representations”. Afterwards there are the usual drinks and snacks!
Abstract: This paper focuses on Bayesian Optimization, typically considered with continuous inputs, for discrete search spaces, including integer, categorical or graph-structured input variables. In Gaussian process-based Bayesian Optimization a problem arises, as it is not straightforward to define a proper kernel on discrete input structures, where no natural notion of smoothness or similarity is available. We propose COMBO, a method that represents values of discrete variables as vertices of a graph and then uses the diffusion kernel on that graph. As the graph size explodes with the number of categorical variables and categories, we propose the graph Cartesian product to decompose the graph into smaller subgraphs, enabling kernel computation in linear time with respect to the number of input variables. Moreover, in our formulation we learn a scale parameter per subgraph. In empirical studies on four discrete optimization problems we demonstrate that our method is on par with or outperforms the state of the art in discrete Bayesian optimization.
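The decomposition described in the abstract rests on a standard fact: the Laplacian of a graph Cartesian product is the Kronecker sum of the factor Laplacians, so the diffusion kernel factorizes into a Kronecker product of small per-variable kernels. The sketch below checks this numerically. It is a minimal illustration, not the COMBO implementation: the complete-graph choice for each variable, the sizes, the scale values and the helper names are assumptions.

```python
import numpy as np


def complete_graph_laplacian(n):
    """Laplacian L = D - A of the complete graph on n vertices."""
    adjacency = np.ones((n, n)) - np.eye(n)
    return np.diag(adjacency.sum(axis=1)) - adjacency


def diffusion_kernel(laplacian, beta):
    """exp(-beta * L), computed from the eigendecomposition of the symmetric L."""
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T


# Two hypothetical categorical variables with 3 and 4 categories,
# each with its own scale parameter.
sizes = [3, 4]
betas = [0.5, 1.2]
laplacians = [complete_graph_laplacian(n) for n in sizes]

# Factorized route: one small kernel per variable, combined by a Kronecker product.
factor_kernels = [diffusion_kernel(lap, b) for lap, b in zip(laplacians, betas)]
k_factorized = np.kron(factor_kernels[0], factor_kernels[1])

# Direct route: build the (scaled) Laplacian of the 12-vertex product graph.
eye_a, eye_b = np.eye(sizes[0]), np.eye(sizes[1])
product_laplacian = (
    betas[0] * np.kron(laplacians[0], eye_a.shape[0] and eye_b)
    + betas[1] * np.kron(eye_a, laplacians[1])
)
k_direct = diffusion_kernel(product_laplacian, 1.0)

max_abs_diff = float(np.max(np.abs(k_factorized - k_direct)))
```

The two routes agree up to rounding, but the factorized one only ever decomposes matrices whose size is the number of categories of a single variable, which is why the cost grows with the number of variables rather than with the size of the full product graph.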
You are all cordially invited to the AMLab seminar on Thursday Feb 28th at 16:00 in C3.163, where Christos Louizos will give a talk titled “Learning Exchangeable Distributions”. Afterwards there are the usual drinks and snacks!
Abstract: We present a new family of models that directly parametrize exchangeable distributions; it is realized via the introduction of an explicit model for the dependency structure of the joint probability distribution over the data, while respecting the permutation invariance of an exchangeable distribution. This is achieved by combining two recent advances in variational inference and probabilistic modelling for graphs: normalizing flows and (di)graphons. We empirically demonstrate that such models are also approximately consistent, hence they can also provide epistemic uncertainty about their predictions without positing an explicit prior over global variables. We show how to train such models on data and evaluate their predictive capabilities as well as the quality of their uncertainty on various tasks.
You are all cordially invited to the AMLab seminar on Thursday Feb 21st at 16:00 in C3.163, where Thomas Kipf will give a talk titled “Compositional Imitation Learning: Explaining and executing one task at a time”. Afterwards there are the usual drinks and snacks!
Abstract: We introduce a framework for Compositional Imitation Learning and Execution (CompILE) of hierarchically-structured behavior. CompILE learns reusable, variable-length segments of behavior from demonstration data using a novel unsupervised, fully-differentiable sequence segmentation module. These learned behaviors can then be re-composed and executed to perform new tasks. At training time, CompILE auto-encodes observed behavior into a sequence of latent codes, each corresponding to a variable-length segment in the input sequence. Once trained, our model generalizes to sequences of longer length and from environment instances not seen during training. We evaluate our model in a challenging 2D multi-task environment and show that CompILE can find correct task boundaries and event encodings in an unsupervised manner without requiring annotated demonstration data. Latent codes and associated behavior policies discovered by CompILE can be used by a hierarchical agent, where the high-level policy selects actions in the latent code space, and the low-level, task-specific policies are simply the learned decoders. We found that our agent could learn given only sparse rewards, where agents without task-specific policies struggle.
You are all cordially invited to the AMLab seminar on Thursday 14th Feb at 16:00 in C3.163, where Victor Garcia will give a talk titled “GRIN: Graphical Recurrent Inference Networks”. Afterwards there are the usual drinks and snacks!
Abstract: A graphical model is a structured representation of the data generating process. The traditional method to reason over random variables is to perform inference in this graphical model. However, in many cases the generating process is only a poor approximation of the much more complex true data generation process, leading to poor posterior estimates. The subtleties of the generative process are however captured in the data itself and we can “learn to infer”, that is, learn a direct mapping from observations to explanatory latent variables. In this work we propose a hybrid model that combines graphical inference with a learned inverse model, which we structure as a graph neural network. The iterative algorithm is formulated as a recurrent neural network. By using cross-validation we can automatically balance the amount of work performed by graphical inference versus learned inference. We apply our ideas to the Kalman filter, a Gaussian hidden Markov model for time sequences. We apply our “Graphical Recurrent Inference” method to a number of path estimation tasks and show that it successfully outperforms either learned or graphical inference run in isolation.
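For readers unfamiliar with the graphical-inference half of the hybrid, here is a minimal sketch of a scalar Kalman filter. It covers only classical inference in a Gaussian hidden Markov model, not the learned graph neural network component; the random-walk model, the noise variances and the function names are assumptions made for illustration.

```python
def kalman_step(mean, var, observation, process_var, obs_var):
    """One predict/update step for x_t = x_{t-1} + w_t, y_t = x_t + v_t."""
    # Predict: a random walk keeps the mean and inflates the variance.
    pred_mean = mean
    pred_var = var + process_var
    # Update: blend prediction and observation using the Kalman gain.
    gain = pred_var / (pred_var + obs_var)
    new_mean = pred_mean + gain * (observation - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


def kalman_filter(observations, init_mean=0.0, init_var=1.0,
                  process_var=0.1, obs_var=0.5):
    """Run the filter over a sequence and return (mean, variance) per step."""
    mean, var = init_mean, init_var
    estimates = []
    for y in observations:
        mean, var = kalman_step(mean, var, y, process_var, obs_var)
        estimates.append((mean, var))
    return estimates


# With equal prior and observation variance and no process noise,
# the update lands halfway between the prior mean and the observation.
first_mean, first_var = kalman_step(0.0, 1.0, 2.0, process_var=0.0, obs_var=1.0)
estimates = kalman_filter([1.0, 1.2, 0.9])
```

When the assumed noise model is a poor match for the real process, these estimates degrade, which is the gap a learned inverse model is meant to close.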
You are all cordially invited to the AMLab seminar on Thursday January 31 at 16:00 in C3.163, where Emiel Hoogeboom will give a talk titled “Emerging Convolutions for Generative Normalizing Flows”. Afterwards there are the usual drinks and snacks!
Abstract: Generative flows are attractive because they admit exact likelihood optimization and efficient image synthesis. Recently, Kingma & Dhariwal (2018) demonstrated with Glow that generative flows are capable of generating high quality images. We generalize the 1 × 1 convolutions proposed in Glow to invertible d × d convolutions, which are more flexible since they operate on both channel and spatial axes. We propose two methods to produce invertible convolutions that have receptive fields identical to standard convolutions: Emerging convolutions are obtained by chaining specific autoregressive convolutions, and periodic convolutions are decoupled in the frequency domain. Our experiments show that the flexibility of d × d convolutions significantly improves the performance of generative flow models on galaxy images, CIFAR10 and ImageNet.
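As background for the generalization described above, the following sketch shows the 1 × 1 case only: an invertible channel-mixing convolution, its exact inverse, and its log-determinant. It does not implement emerging or periodic convolutions. The tensor sizes, the scaled-orthogonal weight and the function names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
channels, height, width = 3, 4, 4  # hypothetical feature map size

# A well-conditioned invertible weight: a random orthogonal matrix scaled by 1.5.
orthogonal, _ = np.linalg.qr(rng.normal(size=(channels, channels)))
weight = 1.5 * orthogonal


def conv1x1_forward(x, weight):
    """Apply the same channel-mixing matrix at every spatial position."""
    y = np.einsum("ij,jhw->ihw", weight, x)
    # The Jacobian is block diagonal with one copy of `weight` per pixel,
    # so its log-determinant is (height * width) * log|det(weight)|.
    _, log_abs_det = np.linalg.slogdet(weight)
    log_det = x.shape[1] * x.shape[2] * log_abs_det
    return y, log_det


def conv1x1_inverse(y, weight):
    """Invert the layer by applying the inverse matrix at every position."""
    return np.einsum("ij,jhw->ihw", np.linalg.inv(weight), y)


x = rng.normal(size=(channels, height, width))
y, log_det = conv1x1_forward(x, weight)
reconstruction_error = float(np.max(np.abs(conv1x1_inverse(y, weight) - x)))
```

The log-determinant is what enters the exact likelihood of a flow. A d × d convolution mixes spatial positions as well, so its Jacobian is no longer block diagonal, and keeping the inverse and determinant tractable requires extra structure such as the autoregressive or frequency-domain constructions named in the abstract.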
You are all cordially invited to the AMLab seminar on Thursday January 17 at 16:00 in C3.163, where Herke van Hoof will give a talk titled “Learning Selective Coverage Strategies for Surveying and Search”. Afterwards there are the usual drinks and snacks!
Abstract: In this seminar, I’ll present a project I’ve been working on with Sandeep Manjanna and Gregory Dudek (Mobile Robotics Lab, McGill University). In this project, we investigated selective coverage strategies for a robot tasked with surveying or searching prioritised locations in a given area. This problem can be modelled as a Markov decision process and solved with reinforcement learning strategies, but the state space is extremely large, requiring these states to be aggregated. The proposed state aggregation method is shown to generalize well between different environments. In field tests over reefs at the Folkestone Marine Reserve, an autonomous surface vehicle using this method was able to increase the number of usable visual data samples.
You are all cordially invited to the AMLab seminar on Thursday December 13 at 16:00 in C3.163 (FNWI, Amsterdam Science Park), where Tom Claassen (Radboud/UvA) will give a talk titled “Causal discovery from real-world data: relaxing the faithfulness assumption”. Afterwards there are the usual drinks and snacks.
Abstract: The so-called causal Markov and causal faithfulness assumptions are well-established pillars behind causal discovery from observational data. The first is closely related to the memorylessness property of dynamical systems, and allows us to predict observable conditional independencies in the data from the underlying causal model. The second is the causal equivalent of Ockham’s razor, and enables us to reason backwards from data to the causal model of interest.
Though theoretically reasonable, in practice with limited data from real-world systems we often encounter violations of faithfulness. Some of these, like weak long-distance interactions, are handled surprisingly well by benchmark constraint-based algorithms such as FCI. Other violations may imply inconsistencies between observed (conditional) independence statements in the data that cannot currently be handled both effectively and efficiently by most constraint-based algorithms. A fundamental question is whether our output retains any validity when not all our assumptions are satisfied, or whether it is still possible to reliably rescue parts of the model.
In this talk we introduce a novel approach based on a relaxed form of the faithfulness assumption that is able to handle many of the detectable faithfulness violations efficiently while ensuring the output causal model remains valid. Essentially we obtain a principled and efficient form of error-correction on observed in/dependencies that can significantly improve both accuracy and reliability of the output causal models in practice. True, it cannot handle all possible violations, but the relaxed faithfulness assumption may be a promising step towards a more realistic, and so more effective, underpinning of the challenging task of causal discovery from real-world systems.
You are all cordially invited to the AMLab seminar on Thursday November 29 at 16:00 in C3.163, where Daniel Worrall will give a talk titled “Semigroup Convolutional Neural Networks: Merging Scale-space and Deep Learning”. Afterwards there are the usual drinks and snacks!
Abstract: Group convolutional neural networks (GCNN) are symmetric under predefined, invertible transformations of the input, e.g. rotations, flips, and translations. Can we extend this framework in the absence of invertibility, for instance in the case of pixelated image downscalings, or causal time-shifting of audio signals? To this end, I present Semigroup Convolutional Neural Networks (SCNN), a generalisation of GCNNs based on the related theory of semigroups. I will showcase a specialisation of a scale-equivariant SCNN, where the activations of each layer of the network live on a classical scale-space, finally linking the classical field of scale-spaces and modern deep learning.
You are all cordially invited to the AMLab seminar on Thursday November 22 at 16:00 in C3.163 (FNWI, Amsterdam Science Park), where Maurice Weiler will give a talk titled “3D Steerable CNNs”. Afterwards there are the usual drinks and snacks.
Abstract: We present a convolutional network that is equivariant to rigid body motions. The model uses scalar-, vector-, and tensor fields over 3D Euclidean space to represent data, and equivariant convolutions to map between such representations. These SE(3)-equivariant convolutions utilize kernels which are parameterized as a linear combination of a complete steerable kernel basis, which is derived analytically in this paper. We prove that equivariant convolutions are the most general equivariant linear maps between fields over R^3. Our experimental results confirm the effectiveness of 3D Steerable CNNs for the problem of amino acid propensity prediction and protein structure classification, both of which have inherent SE(3) symmetry.