AMLab | Amsterdam Machine Learning Lab

Maurice Weiler

PhD candidate (advised by Max Welling and Erik Verlinde)
AMLab
Institute of Informatics
University of Amsterdam
Science Park 904, C3.250a


Research on equivariant neural networks, specifically steerable CNNs and coordinate independent convolutions on Riemannian manifolds.

Selected Publications

Book

Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems

Zhang, Xuan, Wang, Limei, Helwig, Jacob, Luo, Youzhi, Fu, Cong, Xie, Yaochen, Liu, Meng, Lin, Yuchao, Xu, Zhao, Yan, Keqiang, Adams, Keir, Weiler, Maurice, Li, Xiner, Fu, Tianfan, Wang, Yucheng, Yu, Haiyang, Xie, YuQing, Fu, Xiang, Strasser, Alex, Xu, Shenglong, Liu, Yi, Du, Yuanqi, Saxton, Alexandra, Ling, Hongyi, Lawrence, Hannah, Stärk, Hannes, Gui, Shurui, Edwards, Carl, Gao, Nicholas, Ladera, Adriana, Wu, Tailin, Hofgard, Elyssa F., Tehrani, Aria Mansouri, Wang, Rui, Daigavane, Ameya, Bohde, Montgomery, Kurtin, Jerry, Huang, Qian, Phung, Tuong, Xu, Minkai, Joshi, Chaitanya K., Mathis, Simon V., Azizzadenesheli, Kamyar, Fang, Ada, Aspuru-Guzik, Alán, Bekkers, Erik, Bronstein, Michael, Zitnik, Marinka, Anandkumar, Anima, Ermon, Stefano, Liò, Pietro, Yu, Rose, Günnemann, Stephan, Leskovec, Jure, Ji, Heng, Sun, Jimeng, Barzilay, Regina, Jaakkola, Tommi, Coley, Connor W., Qian, Xiaoning, Qian, Xiaofeng, Smidt, Tess, and Ji, Shuiwang

In arXiv preprint arXiv:2307.08423, 2023

Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural sciences. Today, AI has started to advance natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). Being an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed yet challenging. This paper aims to provide a technically thorough account of a subarea of AI4Science; namely, AI for quantum, atomistic, and continuum systems. These areas aim at understanding the physical world from the subatomic (wavefunctions and electron density), atomic (molecules, proteins, materials, and interactions), to macro (fluids, climate, and subsurface) scales and form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture physics first principles, especially symmetries, in natural systems by deep learning methods. We provide an in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations. We also discuss other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification. To facilitate learning and education, we provide categorized lists of resources that we found to be useful. We strive to be thorough and unified and hope this initial effort may trigger more community interests and efforts to further advance AI4Science.
Book

Equivariant and Coordinate Independent Convolutional Networks - A Gauge Field Theory of Neural Networks

Weiler, Maurice, Forré, Patrick, Verlinde, Erik, and Welling, Max

2023

ICLR

Steerable Partial Differential Operators for Equivariant Neural Networks

Jenner, Erik, and Weiler, Maurice

In International Conference on Learning Representations (ICLR), 2022

Recent work in equivariant deep learning bears strong similarities to physics. Fields over a base space are fundamental entities in both subjects, as are equivariant maps between these fields. In deep learning, however, these maps are usually defined by convolutions with a kernel, whereas they are partial differential operators (PDOs) in physics. Developing the theory of equivariant PDOs in the context of deep learning could bring these subjects even closer together and lead to a stronger flow of ideas. In this work, we derive a G-steerability constraint that completely characterizes when a PDO between feature vector fields is equivariant, for arbitrary symmetry groups G. We then fully solve this constraint for several important groups. We use our solutions as equivariant drop-in replacements for convolutional layers and benchmark them in that role. Finally, we develop a framework for equivariant maps based on Schwartz distributions that unifies classical convolutions and differential operators and gives insight about the relation between the two.
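As a small illustration of what equivariance of a PDO means, the following sketch checks numerically that a discretized Laplacian commutes with 90-degree rotations of a scalar field, while a single partial derivative does not. This is not the paper's construction: the restriction to scalar fields, the finite-difference stencils, the periodic boundaries and the use of only the four-fold rotation group are all simplifying assumptions made here.

```python
import numpy as np

def laplacian(f):
    """5-point finite-difference Laplacian with periodic boundaries."""
    return (np.roll(f, 1, axis=0) + np.roll(f, -1, axis=0)
            + np.roll(f, 1, axis=1) + np.roll(f, -1, axis=1) - 4.0 * f)

def d_dx(f):
    """Central difference along axis 1 with periodic boundaries."""
    return 0.5 * (np.roll(f, -1, axis=1) - np.roll(f, 1, axis=1))

rng = np.random.default_rng(0)
f = rng.normal(size=(16, 16))  # hypothetical scalar field on a square grid

# Equivariance test: applying the operator after rotating the input should
# equal rotating the output of the operator.
lap_commutes = np.allclose(laplacian(np.rot90(f)), np.rot90(laplacian(f)))
ddx_commutes = np.allclose(d_dx(np.rot90(f)), np.rot90(d_dx(f)))

print("Laplacian commutes with rot90:", lap_commutes)
print("d/dx commutes with rot90:", ddx_commutes)
```

The Laplacian stencil is symmetric under quarter turns, so it passes the check; the directional derivative picks out a preferred axis and fails it when acting on scalar fields.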
ICLR

A program to build E(N)-equivariant steerable CNNs

Cesa, Gabriele, Lang, Leon, and Weiler, Maurice

In International Conference on Learning Representations (ICLR), 2022

Equivariance is becoming an increasingly popular design choice to build data efficient neural networks by exploiting prior knowledge about the symmetries of the problem at hand. Euclidean steerable CNNs are one of the most common classes of equivariant networks. While the constraints these architectures need to satisfy are understood, no practical method to parametrize them generally has been described so far, with most existing approaches tailored to specific groups or classes of groups. In this work, we generalize the Wigner-Eckart theorem proposed in (Lang et al.), which characterizes general G-steerable kernel spaces for compact groups G over their homogeneous spaces, to arbitrary G-spaces. This enables us to directly parameterize filters in terms of a band-limited basis on the base space, but also to easily implement steerable CNNs equivariant to a large number of groups. To demonstrate its generality, we instantiate our method on a large variety of isometry groups acting on the Euclidean space R^3. Our general framework allows us to build E(3) and SE(3)-steerable CNNs like previous works, but also CNNs with arbitrary G<=O(3)-steerable kernels. For example, we build 3D CNNs equivariant to the symmetries of platonic solids or choose G=SO(2) when working with 3D data having only azimuthal symmetries. We compare these models on 3D shapes and molecular datasets, observing improved performance by matching the model’s symmetries to the ones of the data.
ICLR

Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs

Haan, Pim, Weiler, Maurice, Cohen, Taco, and Welling, Max

In International Conference on Learning Representations (ICLR), 2021

A common approach to define convolutions on meshes is to interpret them as a graph and apply graph convolutional networks (GCNs). Such GCNs utilize isotropic kernels and are therefore insensitive to the relative orientation of vertices and thus to the geometry of the mesh as a whole. We propose Gauge Equivariant Mesh CNNs which generalize GCNs to apply anisotropic gauge equivariant kernels. Since the resulting features carry orientation information, we introduce a geometric message passing scheme defined by parallel transporting features over mesh edges. Our experiments validate the significantly improved expressivity of the proposed model over conventional GCNs and other methods.
ICLR

A Wigner-Eckart Theorem for Group Equivariant Convolution Kernels

Lang, Leon, and Weiler, Maurice

In International Conference on Learning Representations (ICLR), 2020

Group equivariant convolutional networks (GCNNs) endow classical convolutional networks with additional symmetry priors, which can lead to a considerably improved performance. Recent advances in the theoretical description of GCNNs revealed that such models can generally be understood as performing convolutions with G-steerable kernels, that is, kernels that satisfy an equivariance constraint themselves. While the G-steerability constraint has been derived, it has to date only been solved for specific use cases - a general characterization of G-steerable kernel spaces is still missing. This work provides such a characterization for the practically relevant case of G being any compact group. Our investigation is motivated by a striking analogy between the constraints underlying steerable kernels on the one hand and spherical tensor operators from quantum mechanics on the other hand. By generalizing the famous Wigner-Eckart theorem for spherical tensor operators, we prove that steerable kernel spaces are fully understood and parameterized in terms of 1) generalized reduced matrix elements, 2) Clebsch-Gordan coefficients, and 3) harmonic basis functions on homogeneous spaces.
arXiv

Coordinate Independent Convolutional Networks - Isometry and Gauge Equivariant Convolutions on Riemannian Manifolds

Weiler, Maurice, Forré, Patrick, Verlinde, Erik, and Welling, Max

In arXiv preprint arXiv:2106.06020, 2021

Motivated by the vast success of deep convolutional networks, there is a great interest in generalizing convolutions to non-Euclidean manifolds. A major complication in comparison to flat spaces is that it is unclear in which alignment a convolution kernel should be applied on a manifold. The underlying reason for this ambiguity is that general manifolds do not come with a canonical choice of reference frames (gauge). Kernels and features therefore have to be expressed relative to arbitrary coordinates. We argue that the particular choice of coordinatization should not affect a network’s inference - it should be coordinate independent. A simultaneous demand for coordinate independence and weight sharing is shown to result in a requirement on the network to be equivariant under local gauge transformations (changes of local reference frames). The ambiguity of reference frames depends thereby on the G-structure of the manifold, such that the necessary level of gauge equivariance is prescribed by the corresponding structure group G. Coordinate independent convolutions are proven to be equivariant w.r.t. those isometries that are symmetries of the G-structure. The resulting theory is formulated in a coordinate free fashion in terms of fiber bundles. To exemplify the design of coordinate independent convolutions, we implement a convolutional network on the Möbius strip. The generality of our differential geometric formulation of convolutional networks is demonstrated by an extensive literature review which explains a large number of Euclidean CNNs, spherical CNNs and CNNs on general surfaces as specific instances of coordinate independent convolutions.
NeurIPS

General E(2)-Equivariant Steerable CNNs

Weiler, Maurice, and Cesa, Gabriele

In Conference on Neural Information Processing Systems (NeurIPS), 2019

The big empirical success of group equivariant networks has led in recent years to the sprouting of a great variety of equivariant network architectures. A particular focus has thereby been on rotation and reflection equivariant CNNs for planar images. Here we give a general description of E(2)-equivariant convolutions in the framework of Steerable CNNs. The theory of Steerable CNNs thereby yields constraints on the convolution kernels which depend on group representations describing the transformation laws of feature spaces. We show that these constraints for arbitrary group representations can be reduced to constraints under irreducible representations. A general solution of the kernel space constraint is given for arbitrary representations of the Euclidean group E(2) and its subgroups. We implement a wide range of previously proposed and entirely new equivariant network architectures and extensively compare their performances. E(2)-steerable convolutions are further shown to yield remarkable gains on CIFAR-10, CIFAR-100 and STL-10 when used as drop in replacement for non-equivariant convolutions.
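The kernel constraint mentioned in this abstract is commonly written as k(gx) = rho_out(g) k(x) rho_in(g)^{-1} for all g in G. The sketch below checks it numerically for one rotation in SO(2). The two kernels, the choice of a scalar input field and a vector output field, and all function names are hypothetical examples chosen for illustration; they are not taken from the paper or its library.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix, used here as the group element g in SO(2)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def steerable_kernel(x):
    """Hypothetical scalar-to-vector kernel k(x) = exp(-|x|^2) x, shape (2, 1)."""
    return (np.exp(-x @ x) * x).reshape(2, 1)

def naive_kernel(x):
    """Hypothetical kernel with a fixed output direction, shape (2, 1)."""
    return (np.exp(-x @ x) * np.array([1.0, 0.0])).reshape(2, 1)

def satisfies_constraint(kernel, g, rho_in, rho_out, x):
    """Check k(g x) == rho_out(g) k(x) rho_in(g)^{-1} at a single point x."""
    lhs = kernel(g @ x)
    rhs = rho_out @ kernel(x) @ np.linalg.inv(rho_in)
    return np.allclose(lhs, rhs)

rng = np.random.default_rng(0)
x = rng.normal(size=2)   # arbitrary sample point in the plane
g = rotation(0.7)        # arbitrary rotation angle

rho_in = np.eye(1)       # trivial representation: input is a scalar field
rho_out = g              # standard representation: output is a vector field

steerable_ok = satisfies_constraint(steerable_kernel, g, rho_in, rho_out, x)
naive_ok = satisfies_constraint(naive_kernel, g, rho_in, rho_out, x)

print("radial vector kernel satisfies the constraint:", steerable_ok)
print("fixed-direction kernel satisfies the constraint:", naive_ok)
```

A single point and a single group element are of course only a spot check; the constraint has to hold for every x and every g in the chosen group.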
ICML workshop

Covariance in Physics and Convolutional Neural Networks

Cheng, Miranda, Anagiannis, Vassilis, Weiler, Maurice, Haan, Pim, Cohen, Taco, and Welling, Max

In ICML 2019 Workshop on Theoretical Physics for Deep Learning, 2019

In this proceeding we give an overview of the idea of covariance (or equivariance) featured in the recent development of convolutional neural networks (CNNs). We study the similarities and differences between the use of covariance in theoretical physics and in the CNN context. Additionally, we demonstrate that the simple assumption of covariance, together with the required properties of locality, linearity and weight sharing, is sufficient to uniquely determine the form of the convolution.
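The last claim of this abstract can be illustrated in the simplest setting, periodic 1D signals with cyclic shifts as the symmetry. In the sketch below, an arbitrary linear map is projected onto the shift-commuting maps by averaging over all shifts, and the result is compared with a circular convolution. The signal length, the random matrix and the helper names are assumptions made for the example, and locality (which would additionally restrict the support of the kernel) is not modelled.

```python
import numpy as np

n = 8  # hypothetical signal length
rng = np.random.default_rng(0)
W = rng.normal(size=(n, n))  # arbitrary linear map, not equivariant in general

def shift_matrix(n, s):
    """Permutation matrix of the cyclic shift: (S f)[i] = f[(i - s) % n]."""
    return np.roll(np.eye(n), s, axis=0)

# Project W onto the linear maps that commute with every cyclic shift
# by averaging its conjugates over the whole shift group.
W_eq = sum(shift_matrix(n, -s) @ W @ shift_matrix(n, s) for s in range(n)) / n

# A shift-commuting matrix is circulant, so its first column is a kernel.
kernel = W_eq[:, 0]
f = rng.normal(size=n)
conv = np.array([sum(kernel[(i - j) % n] * f[j] for j in range(n))
                 for i in range(n)])

is_convolution = np.allclose(W_eq @ f, conv)
is_equivariant = np.allclose(W_eq @ np.roll(f, 3), np.roll(W_eq @ f, 3))

print("equivariant linear map equals a circular convolution:", is_convolution)
print("map commutes with a shift by 3:", is_equivariant)
```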
NeurIPS

A General Theory of Equivariant CNNs on Homogeneous Spaces

Cohen, Taco S., Geiger, Mario, and Weiler, Maurice

In Conference on Neural Information Processing Systems (NeurIPS), 2019

We present a general theory of Group equivariant Convolutional Neural Networks (G-CNNs) on homogeneous spaces such as Euclidean space and the sphere. Feature maps in these networks represent fields on a homogeneous base space, and layers are equivariant maps between spaces of fields. The theory enables a systematic classification of all existing G-CNNs in terms of their symmetry group, base space, and field type. We also consider a fundamental question: what is the most general kind of equivariant linear map between feature spaces (fields) of given types? Following Mackey, we show that such maps correspond one-to-one with convolutions using equivariant kernels, and characterize the space of such kernels.
ICML

Gauge Equivariant Convolutional Networks and the Icosahedral CNN

Cohen, Taco S., Weiler, Maurice, Kicanaoglu, Berkay, and Welling, Max

In International Conference on Machine Learning (ICML), 2019

The principle of equivariance to symmetry transformations enables a theoretically grounded approach to neural network architecture design. Equivariant networks have shown excellent performance and data efficiency on vision and medical imaging problems that exhibit symmetries. Here we show how this principle can be extended beyond global symmetries to local gauge transformations. This enables the development of a very general class of convolutional neural networks on manifolds that depend only on the intrinsic geometry, and which includes many popular methods from equivariant and geometric deep learning. We implement gauge equivariant CNNs for signals defined on the surface of the icosahedron, which provides a reasonable approximation of the sphere. By choosing to work with this very regular manifold, we are able to implement the gauge equivariant convolution using a single conv2d call, making it a highly scalable and practical alternative to Spherical CNNs. Using this method, we demonstrate substantial improvements over previous methods on the task of segmenting omnidirectional images and global climate patterns.
CVPR

Learning Steerable Filters for Rotation Equivariant CNNs

Weiler, Maurice, Hamprecht, Fred A., and Storath, Martin

In Conference on Computer Vision and Pattern Recognition (CVPR), 2018

In many machine learning tasks it is desirable that a model’s prediction transforms in an equivariant way under transformations of its input. Convolutional neural networks (CNNs) implement translational equivariance by construction; for other transformations, however, they are compelled to learn the proper mapping. In this work, we develop Steerable Filter CNNs (SFCNNs) which achieve joint equivariance under translations and rotations by design. The proposed architecture employs steerable filters to efficiently compute orientation dependent responses for many orientations without suffering interpolation artifacts from filter rotation. We utilize group convolutions which guarantee an equivariant mapping. In addition, we generalize He’s weight initialization scheme to filters which are defined as a linear combination of a system of atomic filters. Numerical experiments show a substantial enhancement of the sample complexity with a growing number of sampled filter orientations and confirm that the network generalizes learned patterns over orientations. The proposed approach achieves state-of-the-art on the rotated MNIST benchmark and on the ISBI 2012 2D EM segmentation challenge.
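The idea of steering a filter without interpolation can be seen in the classic derivative-of-Gaussian example: the filter at any orientation is an exact linear combination of two fixed basis filters. The sketch below verifies this on a small grid. It uses this textbook basis rather than the filter basis of the paper, and the grid size, filter width and function names are assumptions made for the example.

```python
import numpy as np

SIGMA = 1.5  # hypothetical filter width

def grid(size=9):
    """Pixel-centre coordinates of a size x size filter."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    return x, y

def gaussian(x, y):
    return np.exp(-(x**2 + y**2) / (2 * SIGMA**2))

def basis_filters(x, y):
    """The two fixed basis filters: x- and y-derivatives of a Gaussian."""
    g = gaussian(x, y)
    return -x / SIGMA**2 * g, -y / SIGMA**2 * g

def steered(theta, x, y):
    """Filter at orientation theta as a linear combination of the basis."""
    gx, gy = basis_filters(x, y)
    return np.cos(theta) * gx + np.sin(theta) * gy

def sampled_directly(theta, x, y):
    """Directional derivative along (cos theta, sin theta), sampled on the grid."""
    u = np.cos(theta) * x + np.sin(theta) * y
    return -u / SIGMA**2 * gaussian(x, y)

x, y = grid()
theta = np.deg2rad(37.0)  # an angle that is not aligned with the pixel grid
steering_is_exact = np.allclose(steered(theta, x, y),
                                sampled_directly(theta, x, y))

print("steered filter matches directly sampled filter:", steering_is_exact)
```

Because only the two coefficients depend on the angle, responses for many orientations can be obtained from the same two basis responses, with no resampling of the filter.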
NeurIPS

3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data

Weiler, Maurice, Geiger, Mario, Welling, Max, Boomsma, Wouter, and Cohen, Taco S.

In Conference on Neural Information Processing Systems (NeurIPS), 2018

We present a convolutional network that is equivariant to rigid body motions. The model uses scalar-, vector-, and tensor fields over 3D Euclidean space to represent data, and equivariant convolutions to map between such representations. These SE(3)-equivariant convolutions utilize kernels which are parameterized as a linear combination of a complete steerable kernel basis, which is derived analytically in this paper. We prove that equivariant convolutions are the most general equivariant linear maps between fields over R^3. Our experimental results confirm the effectiveness of 3D Steerable CNNs for the problem of amino acid propensity prediction and protein structure classification, both of which have inherent SE(3) symmetry.
arXiv

Intertwiners between Induced Representations (with Applications to the Theory of Equivariant Neural Networks)

Cohen, Taco S., Geiger, Mario, and Weiler, Maurice

In arXiv preprint arXiv:1803.10743, 2018

Group equivariant and steerable convolutional neural networks (regular and steerable G-CNNs) have recently emerged as a very effective model class for learning from signal data such as 2D and 3D images, video, and other data where symmetries are present. In geometrical terms, regular G-CNNs represent data in terms of scalar fields (“feature channels”), whereas the steerable G-CNN can also use vector or tensor fields (“capsules”) to represent data. In algebraic terms, the feature spaces in regular G-CNNs transform according to a regular representation of the group G, whereas the feature spaces in Steerable G-CNNs transform according to the more general induced representations of G. In order to make the network equivariant, each layer in a G-CNN is required to intertwine between the induced representations associated with its input and output space. In this paper we present a general mathematical framework for G-CNNs on homogeneous spaces like Euclidean space or the sphere. We show, using elementary methods, that the layers of an equivariant network are convolutional if and only if the input and output feature spaces transform according to an induced representation. This result, which follows from G.W. Mackey’s abstract theory on induced representations, establishes G-CNNs as a universal class of equivariant network architectures, and generalizes the important recent work of Kondor & Trivedi on the intertwiners between regular representations. In order for a convolution layer to be equivariant, the filter kernel needs to satisfy certain linear equivariance constraints. The space of equivariant kernels has a rich and interesting structure, which we expose using direct calculations. Additionally, we show how this general understanding can be used to compute a basis for the space of equivariant filter kernels, thereby providing a straightforward path to the implementation of G-CNNs for a wide range of groups and manifolds.
ICML workshop

Explorations in homeomorphic variational auto-encoding

Falorsi, Luca, Haan, Pim, Davidson, Tim R, De Cao, Nicola, Weiler, Maurice, Forré, Patrick, and Cohen, Taco S

In ICML 2018 Workshop on Theoretical Foundations and Applications of Deep Generative Models, 2018

The manifold hypothesis states that many kinds of high-dimensional data are concentrated near a low-dimensional manifold. If the topology of this data manifold is non-trivial, a continuous encoder network cannot embed it in a one-to-one manner without creating holes of low density in the latent space. This is at odds with the Gaussian prior assumption typically made in Variational Auto-Encoders (VAEs), because the density of a Gaussian concentrates near a blob-like manifold. In this paper we investigate the use of manifold-valued latent variables. Specifically, we focus on the important case of continuously differentiable symmetry groups (Lie groups), such as the group of 3D rotations SO(3). We show how a VAE with SO(3)-valued latent variables can be constructed, by extending the reparameterization trick to compact connected Lie groups. Our experiments show that choosing manifold-valued latent variables that match the topology of the latent data manifold, is crucial to preserve the topological structure and learn a well-behaved latent space.
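One way a reparameterized sample on SO(3) might be drawn is to sample Gaussian noise in the Lie algebra, map it to the group with the exponential map, and multiply by a mean rotation. The sketch below does this with plain NumPy and checks that the result is a valid rotation. It is an illustrative sketch under these assumptions, not the authors' implementation; the function names, the isotropic noise scale and the composition order are choices made here, and no gradients are computed.

```python
import numpy as np

def hat(v):
    """Map a vector in R^3 to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def exp_so3(v):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    theta = np.linalg.norm(v)
    K = hat(v)
    if theta < 1e-8:
        return np.eye(3) + K  # first-order approximation near the identity
    return (np.eye(3)
            + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def sample_so3(R_mean, sigma, rng):
    """Draw R = R_mean exp(sigma * eps) with eps ~ N(0, I) in the Lie algebra.

    The noise eps does not depend on R_mean or sigma, which is the property
    that the reparameterization trick relies on.
    """
    eps = rng.normal(size=3)
    return R_mean @ exp_so3(sigma * eps)

rng = np.random.default_rng(0)
R_mean = exp_so3(np.array([0.3, -0.2, 0.5]))  # hypothetical mean rotation
R = sample_so3(R_mean, sigma=0.1, rng=rng)

is_orthogonal = np.allclose(R @ R.T, np.eye(3))
has_unit_det = np.isclose(np.linalg.det(R), 1.0)

print("sample is orthogonal:", is_orthogonal)
print("sample has determinant one:", has_unit_det)
```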