Author Archives: Joris Mooij

About Joris Mooij

Assistant Professor

Tineke Blom joins AMLab

Tineke Blom joined AMLab as a PhD student on September 1st. Tineke studied Mathematical Sciences at Utrecht University and her research interests include causality, applied mathematics, and learning algorithms. She will conduct research on causality as part of the ERC starting grant project CAFES led by Joris Mooij.

First prize in CRM Causal Inference Challenge

An interdisciplinary team of AMLab researchers, a biologist, and a doctor won first prize in the CRM Causal Inference Challenge (part of the workshop “Statistical Causal Inference and its Applications to Genetics”, July 25 – August 19, Montreal, Canada). The team was led by Joris Mooij and consisted of AMLab members Tom Claassen, Sara Magliacane, Philip Versteeg, Stephan Bongers, Thijs van Ommen, Patrick Forre, and external researchers Renée van Amerongen (Swammerdam Institute for Life Sciences) and Lucas van Eijk (Radboud University Medical Center). The task of the challenge was to predict the values of certain phenotypic variables of knockout mice, given data from wild-type and other knockout mice.

Talk by Joris Mooij

You are all cordially invited to the AMLab colloquium on April 26 at 16:00 in C3.163, where Joris Mooij will give a talk titled “Automating Causal Discovery and Prediction”. Afterwards there are drinks and snacks!

Abstract: The discovery of causal relationships from experimental data and the construction of causal theories to describe phenomena are fundamental pillars of the scientific method. How to reason effectively with causal models, how to learn them from data, and how to obtain causal predictions have traditionally been considered to be outside the realm of statistics. Therefore, most empirical scientists still perform these tasks informally, without the help of mathematical tools and algorithms. This traditional, informal way of doing causal inference does not scale, and this is becoming a serious bottleneck in the analysis of the outcomes of today's large-scale experiments. In this talk I will describe formal causal reasoning methods and algorithms that can help to automate the process of scientific discovery from data.

Tom Claassen joins AMLab

Tom Claassen joined AMLab as a part-time postdoc (50%). Tom studied physics in Twente and worked for several years as a Systems Architect before doing his PhD on causal discovery and logic at Radboud University Nijmegen. Tom will work on causality as a team member of the VIDI project of Joris Mooij.

Talk by Martin Gullaksen

You are all cordially invited to a presentation on Friday, April 8th, from 16:00-17:00 in C1.112 by Martin Gullaksen on his master’s thesis entitled “Probabilistic Spatio-Temporal Inference in Early Embryonic Development. The case of Drosophila Melanogaster”.

Abstract: Inferring gene regulatory networks from spatio-temporal expression
data is a major problem in biology. This thesis proposes a new dynamic
Bayesian network approach that is capable of realistically inferring gene
regulatory networks and producing high-quality simulations. We benchmark it
on the well-researched gap gene problem of Drosophila melanogaster. The
thesis solves practical issues currently associated with spatio-temporal gene
inference, such as computational time and parameter fragility, while recovering
a gene regulatory network and matrix similar to our ground-truth network. The
proposed modelling framework computes the gene regulatory network in 10-15
seconds on a modern laptop, effectively removing the computational barrier of
the problem and allowing future gene regulatory networks with a greater gene
count to be processed. Besides producing a gene regulatory matrix, our method
also produces high-quality simulations of the gene activation levels of the gap
gene problem. In addition, unlike many competing problem formulations, the
proposed model is probabilistic in nature, hence allowing statistical inference
to be made. Finally, using Bayesian statistics, we perform robustness tests on
the topology of our proposed gene regulatory network and on our regulatory
weights.

Talk by Errol Zalmijn (ASML)

You are all cordially invited to a presentation on Wednesday, March 9, at 11:00 in C3.163 by Errol Zalmijn, data analyst at ASML, on “Transfer entropy: an information signature of causation in ASML lithographic time series analysis”.

Abstract: The ASML lithography system can be viewed as a complex, distributed computing system and modeled as a network of driving and driven (responding) observables, i.e., cause-and-effect relationships. Transfer entropy (Schreiber, 2000), an information-theoretic measure of time-directed information transfer between jointly dependent processes, enables detection of causal interactions between simultaneously observed time series from lithographic system data. Being a non-parametric measure capable of identifying arbitrary linear and non-linear causal effects, transfer entropy can help to gain a better understanding of the underlying system dynamics, a prerequisite for accurate diagnosis and prognosis as well as structural design improvements.
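For readers unfamiliar with the measure, here is a minimal sketch of transfer entropy for two discrete time series. It assumes a history length of 1 and a simple plug-in (frequency count) estimator; the function name `transfer_entropy` and the toy data are hypothetical illustrations, not the method or data used in the talk.

```python
import random
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate (in bits) of the transfer entropy source -> target.

    Assumes discrete symbols and history length 1:
    T = sum p(y_next, y_prev, x_prev)
            * log2( p(y_next | y_prev, x_prev) / p(y_next | y_prev) )
    """
    # Each triple is (next target value, previous target value, previous source value).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    joint = Counter(triples)
    prev_pair = Counter((y_prev, x_prev) for _, y_prev, x_prev in triples)
    target_pair = Counter((y_next, y_prev) for y_next, y_prev, _ in triples)
    prev_target = Counter(y_prev for _, y_prev, _ in triples)

    te = 0.0
    for (y_next, y_prev, x_prev), count in joint.items():
        p_joint = count / n
        p_given_both = count / prev_pair[(y_prev, x_prev)]
        p_given_target = target_pair[(y_next, y_prev)] / prev_target[y_prev]
        te += p_joint * log2(p_given_both / p_given_target)
    return te


# Toy data (an assumption for illustration): y copies x with a delay of one step,
# so information flows from x to y but not from y to x.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]

print("x -> y:", transfer_entropy(x, y))
print("y -> x:", transfer_entropy(y, x))
```

On this toy example the estimate in the driving direction should be close to one bit, and the estimate in the reverse direction close to zero. Real-valued sensor data would first need discretization or a different estimator, and longer histories are often required in practice.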

Thijs van Ommen joins AMLab

Thijs van Ommen joined AMLab as a postdoc. Thijs studied mathematics and
computer science in Leiden and did his PhD on model selection and prediction
at the CWI. After that, he was a lecturer for a Machine Learning course in
Utrecht; he will now work on causal inference in the CAFES project.

Video Club: Sackler Colloquium “Drawing Causal Inference from Big Data”

On Wednesdays from 12:30 to 13:30 we will jointly watch some of the video recordings of presentations given at the recent Sackler Colloquium “Drawing Causal Inference from Big Data” of the National Academy of Sciences. (Lunch is not included – bring it yourself!)

Schedule: at the bottom of the Meetings page

Abstract: This colloquium was motivated by the exponentially growing amount of information collected about complex systems, colloquially referred to as “Big Data”. It was aimed at methods to draw causal inference from these large data sets, most of which are not derived from carefully controlled experiments. Although correlations among observations are vast in number and often easy to obtain, causality is much harder to assess and establish, partly because causality is a vague and poorly specified construct for complex systems. Speakers discussed both the conceptual framework required to establish causal inference and the designs and computational methods that can allow causality to be inferred. The program illustrated state-of-the-art methods with approaches derived from such fields as statistics, graph theory, machine learning, philosophy, and computer science, and the talks covered such domains as social networks, medicine, health, economics, business, internet data and usage, search engines, and genetics. The presentations also addressed the possibility of testing causality in large data settings, and raised certain basic questions: Will access to massive data be a key to understanding the fundamental questions of basic and applied science? Or does the vast increase in data confound analysis, produce computational bottlenecks, and decrease the ability to draw valid causal inferences?