Nonparametric Identifiability of Causal Representations from Unknown Interventions

Authors: Julius von Kügelgen, Michel Besserve, Liang Wendong, Luigi Gresele, Armin Kekić, Elias Bareinboim, David Blei, Bernhard Schölkopf

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We sketch possible learning objectives (§ 5), and empirically investigate training different generative models (§ 6), finding that only those based on the correct causal structure attain the best fit and identify the ground truth."
Researcher Affiliation | Academia | ¹Max Planck Institute for Intelligent Systems, Tübingen, Germany; ²Department of Engineering, University of Cambridge, United Kingdom; ³Columbia University, USA
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "Code to reproduce our experiments is available at: https://github.com/akekic/causal-component-analysis"
Open Datasets | No | The paper describes generating synthetic data for its experiments: "Synthetic Data Generating Process. We consider linear Gaussian latent SCMs of the form V1 := U1, V2 := αV1 + U2... We generate different latent SCMs by drawing α uniformly from [−10, −2] ∪ [2, 10]... We generate the corresponding mixing functions by uniformly sampling each element of the weight matrices..." (See the data-generation sketch after this table.)
Dataset Splits | Yes | "We split the dataset into 70% for training, and 15% for validation and held-out test data, each sampled randomly across all environments." (The split is included in the sketch after this table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., specific GPU or CPU models).
Software Dependencies | No | The paper mentions using specific software components like "normalizing flows", "Neural Spline Flows", and "ADAM optimizer" but does not specify their version numbers.
Experiment Setup | Yes | "We use Neural Spline Flows [30] for the invertible transformation, with a 3-layer feedforward neural network with hidden dimension 128 and permutation in each flow layer, and L = 12 layers... Each environment comprises a total of 200k data points. We use the ADAM optimizer [67] with cosine annealing learning rate scheduling, starting with a learning rate of 5 × 10⁻³ and ending with 1 × 10⁻⁷. We train the model for 200 epochs with a batch size of 4096." (See the training-setup sketch after this table.)
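
Below is a minimal sketch of the quoted synthetic data-generating process and dataset split, assuming NumPy. The standard-normal exogenous noise, the linear mixing map, the observed dimension of 5, and the [−1, 1] range for the mixing weights are illustrative assumptions not pinned down by the excerpt; only the SCM equations, the α range, the 200k points per environment, and the 70/15/15 split come from the quoted text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw the SCM coefficient alpha uniformly from [-10, -2] U [2, 10].
alpha = rng.uniform(2.0, 10.0) * rng.choice([-1.0, 1.0])

# Latent linear Gaussian SCM: V1 := U1, V2 := alpha * V1 + U2
# (standard-normal exogenous noise is an assumption of this sketch).
n = 200_000                                  # data points per environment
u = rng.standard_normal((n, 2))
v = np.stack([u[:, 0], alpha * u[:, 0] + u[:, 1]], axis=1)

# Mixing function with uniformly sampled weight-matrix entries; the linear
# map, output dimension, and [-1, 1] range are assumptions of this sketch.
A = rng.uniform(-1.0, 1.0, size=(2, 5))
x = v @ A

# 70% / 15% / 15% train / validation / test split, sampled randomly.
perm = rng.permutation(n)
n_train, n_val = int(0.70 * n), int(0.15 * n)
train_idx, val_idx, test_idx = np.split(perm, [n_train, n_train + n_val])
x_train, x_val, x_test = x[train_idx], x[val_idx], x[test_idx]
```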
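
And a sketch of the quoted optimizer and schedule configuration, assuming PyTorch. The simple feedforward `model` and the squared-error loss are hypothetical stand-ins so the snippet runs end to end; the paper trains a 12-layer Neural Spline Flow by maximum likelihood. Only the Adam optimizer, the cosine annealing from 5 × 10⁻³ to 1 × 10⁻⁷, the 200 epochs, and the batch size of 4096 are taken from the excerpt.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the 12-layer Neural Spline Flow.
model = nn.Sequential(nn.Linear(5, 128), nn.ReLU(), nn.Linear(128, 5))

data = torch.randn(200_000, 5)               # placeholder for one environment
train_loader = DataLoader(TensorDataset(data), batch_size=4096, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=5e-3)
epochs = 200
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=epochs, eta_min=1e-7    # anneal 5e-3 -> 1e-7, as quoted
)

for epoch in range(epochs):
    for (batch,) in train_loader:            # batch size 4096
        optimizer.zero_grad()
        # The real objective is the flow's negative log-likelihood; a squared
        # reconstruction loss stands in so this sketch is self-contained.
        loss = ((model(batch) - batch) ** 2).mean()
        loss.backward()
        optimizer.step()
    scheduler.step()
```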