Learning Independent Causal Mechanisms

Authors: Giambattista Parascandolo, Niki Kilbertus, Mateo Rojas-Carulla, Bernhard Schölkopf

ICML 2018

Reproducibility Assessment
Each entry below gives the assessed variable, its result, and the supporting LLM response.
Research Type: Experimental
LLM Response: "In this set of experiments we test the method presented in Section 3 on the MNIST dataset transformed with the set of mechanisms described in detail in Appendix C... We ran the experiments 10 times with different random seeds for the initializations. Each experiment is run for 2000 iterations."
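As a rough sketch, the quoted protocol (10 random seeds, 2000 iterations each) maps onto a simple outer loop. `run_experiment` is a hypothetical stand-in for the full training procedure; a single competitive step is sketched under the Pseudocode entry below.

```python
import torch

# Hypothetical driver for the quoted protocol: 10 runs with different
# random seeds, 2000 training iterations each.
def run_experiment(seed: int, iterations: int = 2000) -> None:
    torch.manual_seed(seed)  # the seed controls the network initializations
    for step in range(iterations):
        # one competitive update of experts and discriminator per iteration
        # (see the train_step sketch under the Pseudocode entry)
        pass

for seed in range(10):
    run_experiment(seed)
```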
Researcher Affiliation: Academia
LLM Response: "1 Max Planck Institute for Intelligent Systems, 2 Max Planck ETH Center for Learning Systems, 3 University of Cambridge."
Pseudocode: Yes
LLM Response: "Algorithm 1: Learning independent mechanisms using competition of experts and adversarial training"
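The core of Algorithm 1 is a winner-take-all competition: every expert maps a transformed sample back toward the canonical distribution, the discriminator scores each output, and only the expert whose output scores highest on each sample is updated. A minimal PyTorch sketch of one such step follows; the winner selection and the restriction of updates to winning experts follow the quoted algorithm, while the non-saturating GAN loss and the name `train_step` are our own assumptions.

```python
import torch
import torch.nn.functional as F

def train_step(experts, opt_experts, discriminator, opt_disc,
               x_transformed, x_canonical):
    """One competitive training step (sketch of Algorithm 1)."""
    # Score every expert's output on every sample; the per-sample winner is
    # the expert whose output the discriminator finds most "canonical".
    with torch.no_grad():
        outputs = torch.stack([E(x_transformed) for E in experts])    # (N, B, C, H, W)
        scores = torch.stack([discriminator(o).squeeze(-1) for o in outputs])  # (N, B)
        winners = scores.argmax(dim=0)                                # (B,)
        idx = winners.view(1, -1, 1, 1, 1).expand(1, *outputs.shape[1:])
        fake = outputs.gather(0, idx).squeeze(0)  # winning output per sample

    # Discriminator update: canonical digits are "real", winning expert
    # outputs are "fake".
    d_real = discriminator(x_canonical)
    d_fake = discriminator(fake)
    loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_disc.zero_grad()
    loss_d.backward()
    opt_disc.step()

    # Expert updates: only the winning expert on each sample receives a
    # gradient; the discriminator is held fixed (only the expert's optimizer steps).
    for i, (E, opt) in enumerate(zip(experts, opt_experts)):
        mask = winners == i
        if not mask.any():
            continue
        d_out = discriminator(E(x_transformed[mask]))
        loss_e = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))
        opt.zero_grad()
        loss_e.backward()
        opt.step()
```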
Open Source Code: No
LLM Response: "For the exact experimental parameters and architectures see Appendix B or the PyTorch implementation we will release."
Open Datasets: Yes
LLM Response: "We test our approach on MNIST digits which have undergone various transformations... To show this, we use the Omniglot dataset of letters from different alphabets (Lake et al., 2015)..."
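As a concrete illustration, such transformations can be written as simple tensor operations on MNIST images. The exact mechanism set is defined in Appendix C of the paper, so the choices below (4-pixel translations, contrast inversion, additive noise with an assumed standard deviation of 0.25) should be read as hedged examples, not the paper's definitive list.

```python
import torch
from torchvision import datasets, transforms

def translate(x: torch.Tensor, dx: int, dy: int) -> torch.Tensor:
    # Shift the image by (dx, dy) pixels with wrap-around.
    return torch.roll(x, shifts=(dy, dx), dims=(-2, -1))

def invert(x: torch.Tensor) -> torch.Tensor:
    return 1.0 - x  # contrast inversion; assumes pixel values in [0, 1]

def add_noise(x: torch.Tensor, std: float = 0.25) -> torch.Tensor:
    return (x + std * torch.randn_like(x)).clamp(0.0, 1.0)

mechanisms = [lambda x: translate(x, 4, 0), lambda x: translate(x, 0, 4),
              invert, add_noise]

mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())
```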
Dataset Splits: No
LLM Response: "We split the training partition of MNIST in half, and transform all and only the examples in the first half; this ensures that there is no matching ground truth in the dataset, and that learning is unsupervised. As shown by the two dashed horizontal lines in Figure 6, the transformed test digits achieve a 40% accuracy..." The paper mentions training and test partitions but does not provide specific percentages or counts for a reproducible train/validation/test split, nor does it explicitly mention a validation set.
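The described half/half split translates directly into code. A minimal sketch, assuming torchvision's standard MNIST loader: the first half of the 60,000 training images would have mechanisms applied, while the second half stays canonical, so no transformed example has a matching untransformed counterpart in the data.

```python
from torchvision import datasets, transforms
from torch.utils.data import Subset

mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())
half = len(mnist) // 2  # 30,000 of 60,000 training images

transformed_half = Subset(mnist, range(half))             # mechanisms applied here
canonical_half = Subset(mnist, range(half, len(mnist)))   # "real" inputs for D
```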
Hardware Specification: No
LLM Response: The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models.
Software Dependencies: No
LLM Response: The paper mentions a 'PyTorch implementation' and 'Adam as optimizer' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup: Yes
LLM Response: "Each expert is a CNN with five convolutional layers, 32 filters per layer of size 3×3, ELU (Clevert et al., 2015) as activation function, batch normalization (Ioffe & Szegedy, 2015), and zero padding. The discriminator is also a CNN, with average pooling every two convolutional layers, growing number of filters, and a fully connected layer with 1024 neurons as last hidden layer. Both networks are trained using Adam as optimizer (Kingma & Ba, 2014), with the default hyper-parameters. A minibatch of 32 transformed MNIST digits..."
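A sketch of the two quoted architectures in PyTorch follows. The layer counts, filter sizes, activations, and the 1024-unit hidden layer come from the quote; the expert's sigmoid output head, the discriminator's exact depth and filter progression (16 to 32), and the 28×28 input size are assumptions where the excerpt is silent.

```python
import torch.nn as nn

def expert() -> nn.Sequential:
    layers, c_in = [], 1
    for _ in range(5):  # five conv layers, 32 filters of 3x3, zero padding
        layers += [nn.Conv2d(c_in, 32, kernel_size=3, padding=1),
                   nn.BatchNorm2d(32), nn.ELU()]
        c_in = 32
    # Assumed output head mapping back to a single-channel image.
    layers += [nn.Conv2d(32, 1, kernel_size=3, padding=1), nn.Sigmoid()]
    return nn.Sequential(*layers)

def discriminator() -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ELU(),
        nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ELU(),
        nn.AvgPool2d(2),                        # pooling every two conv layers
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ELU(),
        nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ELU(),
        nn.AvgPool2d(2),                        # 28x28 -> 7x7, growing filters
        nn.Flatten(),
        nn.Linear(32 * 7 * 7, 1024), nn.ELU(),  # 1024-unit last hidden layer
        nn.Linear(1024, 1),                     # logit output
    )
```

Per the quote, each network would then get its own torch.optim.Adam optimizer with the default hyper-parameters, trained on minibatches of 32 transformed digits.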