Invariant Causal Representation Learning for Out-of-Distribution Generalization

Authors: Chaochao Lu, Yuhuai Wu, José Miguel Hernández-Lobato, Bernhard Schölkopf

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on both synthetic and real-world datasets demonstrate that our approach outperforms a variety of baseline methods." (Section 5, Experiments: "We compare our approach with a variety of methods on both synthetic and real-world datasets. In all comparisons, unless stated otherwise, we average performance over ten runs.")
Researcher Affiliation | Collaboration | ¹University of Cambridge, ²MPI for Intelligent Systems, ³Stanford University, ⁴Google Research, ⁵The Alan Turing Institute
Pseudocode | Yes | Algorithm 1: Invariant Causal Representation Learning (iCaRL). Phase 1: We first learn an NF-iVAE model, including the decoder and its corresponding encoder, by optimizing the objective function in (10) on the data {X, Y, E}. Then, we use the mean of the NF-iVAE encoder to infer the latent variables Z from the observations {X, Y, E}; the latent variables are guaranteed to be identified up to a permutation and a simple transformation. Phase 2: After inferring Z, we first run the PC algorithm to learn a Markov equivalence class of DAGs, and then discover the direct causes (parents) of Y among its neighbors by testing all pairs of latent variables with (conditional) independence tests, i.e., finding a set of latent variables in which each pair Zi and Zj satisfies that the dependence between them increases after additionally conditioning on Y. Phase 3: Having obtained Pa(Y), we can solve (11) to learn the invariant classifier w. In a new environment, we first infer Pa(Y) from X by solving (12) and then leverage the learned w for prediction.
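
To make the three phases concrete, here is a minimal runnable sketch on toy data. It is a sketch under strong simplifications, not the paper's implementation: Phase 1 is stubbed (latents Z are drawn directly instead of training NF-iVAE and taking its encoder mean), the PC step of Phase 2 is omitted, and the hypothetical helpers `partial_corr` and `find_parents` with threshold `eps` and a logistic-regression classifier stand in for the paper's (conditional) independence tests and invariant classifier.

```python
# Minimal sketch of Algorithm 1 (iCaRL) on toy data; all names here are
# illustrative stand-ins, not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def partial_corr(a, b, cond):
    """Correlation of a and b after regressing out the columns of cond."""
    X = np.column_stack([np.ones(len(a)), cond])
    ra = a - X @ np.linalg.lstsq(X, a, rcond=None)[0]
    rb = b - X @ np.linalg.lstsq(X, b, rcond=None)[0]
    return np.corrcoef(ra, rb)[0, 1]

def find_parents(Z, y, eps=0.05):
    """Phase 2 (simplified, PC step omitted): keep pairs (Zi, Zj) whose
    dependence increases once we additionally condition on Y -- the
    collider signature of the parents Pa(Y)."""
    n, d = Z.shape
    parents, empty = set(), np.empty((n, 0))
    for i in range(d):
        for j in range(i + 1, d):
            r_marginal = abs(partial_corr(Z[:, i], Z[:, j], empty))
            r_given_y = abs(partial_corr(Z[:, i], Z[:, j], y[:, None]))
            if r_given_y > r_marginal + eps:
                parents.update({i, j})
    return sorted(parents)

# Phase 1 (stubbed): pretend these are the identified latents from NF-iVAE.
Z = rng.standard_normal((2000, 5))
y = (Z[:, 0] + Z[:, 1] + 0.1 * rng.standard_normal(2000) > 0).astype(float)

pa = find_parents(Z, y)  # expected: [0, 1]

# Phase 3: learn the invariant classifier w on the discovered parents.
w = LogisticRegression().fit(Z[:, pa], y)
print("Pa(Y) =", pa, " train accuracy =", w.score(Z[:, pa], y))
```

In this toy setup Y is generated from Z0 and Z1, so conditioning on Y induces extra dependence between exactly that pair, and the classifier is fit on those two latents only.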
Open Source Code | No | The paper does not include an explicit statement about releasing source code for the described methodology, nor a link to a code repository.
Open Datasets | Yes | "We use the exact same environment as in Arjovsky et al. (2019)," who propose creating environments for training to classify digits in the MNIST data. "We modify the Fashion-MNIST dataset in a manner similar to the MNIST digits dataset." The paper also reports results on two widely used realistic datasets for OOD generalization: VLCS (Fang et al., 2013) and PACS (Li et al., 2017a).
Dataset Splits | Yes | For CMNIST: "There are three environments (two training containing 30,000 points each, one test containing 10,000 points)." "We add noise to the preliminary label (y = 0 if the digit is between 0-4 and y = 1 if the digit is between 5-9) by flipping it with 25 percent probability to construct the final labels." For VLCS and PACS: "We used the exact experimental setting that is described in Gulrajani & Lopez-Paz (2020)," training over every train/test environment combination with train-domain validation, one of the commonly used hyperparameter-tuning procedures.
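
The quoted CMNIST construction can be made concrete with a short sketch. The 25% label flip and the environment sizes follow the quoted text; the per-environment color-flip probabilities (0.1 and 0.2 for training, 0.9 for test) follow Arjovsky et al. (2019), while the torchvision loading and the helper name `colored_env` are assumptions for illustration.

```python
# Sketch of the CMNIST environment construction (after Arjovsky et al., 2019).
import torch
from torchvision import datasets

def colored_env(imgs, digits, color_flip_prob):
    """One CMNIST environment: binarize the digit label, flip it with 25%
    probability, then color the image so the color disagrees with the
    final label with probability `color_flip_prob`."""
    y = (digits >= 5).float()                               # preliminary label
    y = torch.abs(y - (torch.rand(len(y)) < 0.25).float())  # 25% label noise
    c = torch.abs(y - (torch.rand(len(y)) < color_flip_prob).float())
    x = torch.stack([imgs, imgs], dim=1)                    # two color channels
    x[c == 1, 0] = 0                                        # zero out one channel
    x[c == 0, 1] = 0                                        # according to color
    return x, y

train = datasets.MNIST("~/data", train=True, download=True)
test = datasets.MNIST("~/data", train=False, download=True)
imgs, digits = train.data.float() / 255.0, train.targets

envs = [
    colored_env(imgs[:30000], digits[:30000], 0.1),             # training env 1
    colored_env(imgs[30000:], digits[30000:], 0.2),             # training env 2
    colored_env(test.data.float() / 255.0, test.targets, 0.9),  # test env
]
```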
Hardware Specification | No | The paper does not provide specific details of the hardware used to run the experiments, such as GPU/CPU models or memory specifications.
Software Dependencies | No | The paper mentions implementing parts of the model in PyTorch (Paszke et al., 2019) or TensorFlow (Abadi et al., 2015), but does not specify the exact versions of these dependencies used for the experiments.
Experiment Setup | Yes | (Appendix N, Hyperparameters and Architectures) "In this section, we describe the hyperparameters and architectures of different models used in different experiments. Unless stated otherwise, we have λ1 = 1 and λ2 = 1, both of which are selected on training/validation data." For synthetic data (N.1): "We used Adam optimizer for training with learning rate set to 1e-3 and batch size set to 128." For CMNIST and CFMNIST (N.2): "the batch size is set to 256, and the learning rate is 10⁻⁴."
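
As a sketch, the reported settings translate into roughly the following training configuration. Only the optimizer choice, learning rates, batch sizes, and λ values come from the quoted Appendix N; the model, data, and loss below are hypothetical placeholders, not the paper's architecture.

```python
# Sketch of the reported training configuration (Appendix N); model, data,
# and loss are placeholders.
import torch

LAMBDA_1, LAMBDA_2 = 1.0, 1.0  # selected on training/validation data

CONFIG = {
    "synthetic": {"lr": 1e-3, "batch_size": 128},  # Appendix N.1
    "cmnist":    {"lr": 1e-4, "batch_size": 256},  # Appendix N.2
}

cfg = CONFIG["synthetic"]
model = torch.nn.Linear(10, 2)  # placeholder architecture
optimizer = torch.optim.Adam(model.parameters(), lr=cfg["lr"])

dataset = torch.utils.data.TensorDataset(
    torch.randn(1024, 10), torch.randint(0, 2, (1024,)))
loader = torch.utils.data.DataLoader(
    dataset, batch_size=cfg["batch_size"], shuffle=True)

for x, y in loader:  # one epoch on the placeholder task
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
```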