A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms

Authors: Yoshua Bengio, Tristan Deleu, Nasim Rahaman, Nan Rosemary Ke, Sébastien Lachapelle, Olexa Bilaniuk, Anirudh Goyal, Christopher Pal

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments in the two-variable case validate the proposed ideas and theoretical results.
Researcher Affiliation | Academia | "Yoshua Bengio (1, 2, 5), Tristan Deleu (1), Nasim Rahaman (4), Nan Rosemary Ke (3), Sébastien Lachapelle (1), Olexa Bilaniuk (1), Anirudh Goyal (1), Christopher Pal (3, 5). Mila, Montreal, Quebec, Canada. 1: Université de Montréal; 2: CIFAR Senior Fellow; 3: École Polytechnique Montréal; 4: Max-Planck Institute for Intelligent Systems, Tübingen; 5: Canada CIFAR AI Chair."
Pseudocode | Yes | Algorithm 1: "Meta-learning algorithm for learning the structural parameter". (A minimal sketch of this outer loop appears after the table.)
Open Source Code | Yes | "The source code for the experiments is available here: https://bit.ly/2M6X1al."
Open Datasets | Yes | "In order to get a set of initial parameters, we first train all 4 modules on a training distribution ($p$ in the main text). This distribution corresponds to a fixed choice of $\pi_A^{(1)}$ and $\pi_{B|a}$ (for all $N$ possible values of $a$). The superscript in $\pi_A^{(1)}$ emphasizes the fact that this defines the distribution prior to an intervention, with the mechanism $p(B \mid A)$ being unchanged by the intervention. These probability vectors are sampled randomly from a uniform Dirichlet distribution: $\pi_A^{(1)} \sim \mathrm{Dirichlet}(\mathbf{1}_N)$ (65) and $\pi_{B|a} \sim \mathrm{Dirichlet}(\mathbf{1}_N)$ for all $a \in [1, N]$ (66). In the continuous case, $A \sim p_\mu(A) = \mathcal{N}(\mu, \sigma^2 = 4)$ (70) and $B := f(A) + N_B$, where $N_B \sim \mathcal{N}(0, 1)$." (See the data-generation sketch after the table.)
Dataset Splits | No | The paper does not provide the specific training/validation/test dataset splits needed for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | "In our experiment, all the MLPs have only one hidden layer with H = 8 hidden units, with a ReLU non-linearity, and the output layer has a softmax non-linearity. The conditional distributions p(B | A) and p(A | B) are parametrized as 2-layer Mixture Density Networks (MDNs; Bishop, 1994), with 32 hidden units and 10 components. The marginal distributions p(A) and p(B) are parametrized as Gaussian Mixture Models (GMMs), also with 10 components. In our experiment, we used T = 20 datapoints. In our experiment, d = 100. In our experiment, θ_D = π/4 is fixed for all our observation and intervention datasets. We choose K = 8 points in our experiments." (See the module sketch after the table.)
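Algorithm 1 is only referenced by name above; the following is a minimal, hypothetical Python sketch of its outer loop in the two-variable case, assuming the paper's regret objective R(γ) = -log[σ(γ) e^{L_{A→B}} + (1 - σ(γ)) e^{L_{B→A}}]. The helper run_inner_adaptation is a stand-in for the inner loop (adapting both candidate models on transfer data), not code from the paper.

```python
# Minimal sketch of Algorithm 1's outer loop (two-variable case).
# run_inner_adaptation() is a hypothetical stand-in that would adapt both
# candidate models on transfer data and return their accumulated online
# log-likelihoods L_{A->B} and L_{B->A}.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_inner_adaptation():
    # Fake log-likelihoods in which the causal hypothesis (A -> B)
    # fits the transfer data slightly better on average.
    base = rng.normal(-100.0, 5.0)
    return base + 2.0, base + rng.normal(0.0, 1.0)

gamma, lr = 0.0, 1.0  # structural parameter and step size (assumed values)

for episode in range(500):
    loglik_AB, loglik_BA = run_inner_adaptation()

    # Regret: R(gamma) = -log( s * e^{L_AB} + (1 - s) * e^{L_BA} ),
    # with s = sigmoid(gamma). Its gradient w.r.t. gamma reduces to
    # s - posterior_AB, where posterior_AB is the responsibility of the
    # A -> B hypothesis given the transfer data.
    log_mix = np.logaddexp(np.log(sigmoid(gamma)) + loglik_AB,
                           np.log(sigmoid(-gamma)) + loglik_BA)
    posterior_AB = np.exp(np.log(sigmoid(gamma)) + loglik_AB - log_mix)
    gamma -= lr * (sigmoid(gamma) - posterior_AB)

print(sigmoid(gamma))  # drifts toward 1, i.e., belief that A -> B is causal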
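The Dirichlet sampling quoted in the Open Datasets row (Eqs. 65-66) is simple to reproduce. Below is a small NumPy sketch of that discrete data-generating process; N is an illustrative choice rather than a value from the paper.

```python
# Sketch of the discrete ground-truth model A -> B from Eqs. (65)-(66):
# the marginal pi_A^(1) and each conditional pi_{B|a} are drawn from a
# uniform Dirichlet prior. N is illustrative, not the paper's setting.
import numpy as np

rng = np.random.default_rng(0)
N = 10

pi_A = rng.dirichlet(np.ones(N))             # pi_A^(1) ~ Dirichlet(1_N)
pi_B_given_a = rng.dirichlet(np.ones(N), N)  # one conditional per value of a

def sample_pair():
    """Draw one (A, B) pair from the ground-truth causal direction A -> B."""
    a = rng.choice(N, p=pi_A)
    b = rng.choice(N, p=pi_B_given_a[a])
    return a, b

# An intervention on the cause resamples pi_A while, as the quoted text
# notes, leaving the mechanism p(B | A) unchanged:
pi_A_intervened = rng.dirichlet(np.ones(N))
```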
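For concreteness, here is a hedged PyTorch sketch of the module parameterization described in the Experiment Setup row: a one-hidden-layer MLP with H = 8 ReLU units and a softmax output, one such module per conditional in the discrete case. The value of N and the one-hot input encoding are assumptions for illustration.

```python
# Hypothetical sketch of one discrete-case module: a single-hidden-layer MLP
# with H = 8 ReLU units and a softmax output, as described in the paper.
# N (number of categories) and the one-hot input encoding are assumptions.
import torch
import torch.nn as nn

N, H = 10, 8

conditional_B_given_A = nn.Sequential(
    nn.Linear(N, H),     # one-hot encoding of the conditioning variable A
    nn.ReLU(),
    nn.Linear(H, N),
    nn.Softmax(dim=-1),  # categorical distribution over B
)

# Example: the distribution p(B | A = 3)
a_onehot = torch.nn.functional.one_hot(torch.tensor(3), N).float()
p_B = conditional_B_given_A(a_onehot)
```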