Learning Causal Models under Independent Changes

Authors: Sarah Mameche, David Kaltenpoth, Jilles Vreeken

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In our experiments, we show that our method performs well in a range of synthetic settings, on realistic gene expression simulations, as well as on real-world cell signaling data."
Researcher Affiliation | Academia | Sarah Mameche, CISPA Helmholtz Center for Information Security (sarah.mameche@cispa.de); David Kaltenpoth, CISPA Helmholtz Center for Information Security (david.kaltenpoth@cispa.de); Jilles Vreeken, CISPA Helmholtz Center for Information Security (jv@cispa.de)
Pseudocode | Yes | Algorithm 1: LINC. Input: data X^(c), candidate DAGs G. Output: DAG G and partitions Π. Algorithm 2: LINCCLUS. Input: data X^(c), variable X_i with causes X_S. Output: partition Π_i. (A hedged interface sketch of these algorithms follows the table.)
Open Source Code | Yes | "We make the code and datasets available in the supplement."
Open Datasets | Yes | "We make the code and datasets available in the supplement." ... "we simulate data with SERGIO [19] to generate single-cell expression data" ... "Finally, we evaluate LINC on real-world data over eleven proteins and phospholipid components" ... "which Sachs et al. [20] added to the system in multiple experiments."
Dataset Splits | No | The paper describes generating synthetic data and simulations with specific parameters and sample sizes per context (e.g., "|c| = 500 samples"), but it does not specify explicit train/validation/test dataset splits (e.g., percentages or exact counts for each partition) or reference standard predefined splits.
Hardware Specification | No | The paper does not explicitly mention any specific hardware details such as GPU models, CPU types, or memory specifications used for running experiments.
Software Dependencies | No | The paper mentions using specific software tools like the "KCI test" but does not provide version numbers for any software dependencies or libraries.
Experiment Setup | Yes | "We sample DAGs G of size |G| = 6 with edge density p = 0.3 and generate data in |C| = 5 contexts, with |c| = 500 samples and |S| = 2." ... "For the synthetic data, following the literature [4, 13] we generate data in multiple contexts using ω_ij^(c) f_ij(X_i^(c)) + σ_j^(c) N_j^(c) (8), with weights ω_ij^(c) ~ U(0.5, 2.5), where noise is either uniform or Gaussian with equal probability. We sample the causal functions f from {x^2, x^3, tanh, sinc}." (A data-generation sketch based on this setup follows the table.)
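
Below is a minimal Python sketch of the synthetic data generation quoted in the Experiment Setup row. It is not the authors' released code: the DAG sampling scheme, the noise scale σ_j^(c), and the choice to resample every weight in every context are assumptions, and the quoted |S| = 2 is not modeled since its role is not explained in the excerpt. The parameters that are used (|G| = 6, edge density p = 0.3, |C| = 5 contexts of 500 samples, ω ~ U(0.5, 2.5), f drawn from {x^2, x^3, tanh, sinc}, uniform or Gaussian noise with equal probability) follow the quoted setup.

```python
# Hedged sketch of the quoted synthetic-data setup (Eq. 8); not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

N_NODES, EDGE_P = 6, 0.3            # |G| = 6, edge density p = 0.3
N_CONTEXTS, N_SAMPLES = 5, 500      # |C| = 5 contexts, |c| = 500 samples per context
FUNCS = [np.square, lambda x: x**3, np.tanh, np.sinc]  # f sampled from {x^2, x^3, tanh, sinc}

# Random DAG as an upper-triangular adjacency with edge probability p (sampling scheme assumed).
adj = np.triu(rng.random((N_NODES, N_NODES)) < EDGE_P, k=1)   # adj[i, j]: X_i -> X_j

# Fixed causal function f_ij per edge; in Eq. (8) f carries no context superscript.
f = {(i, j): FUNCS[rng.integers(len(FUNCS))]
     for i in range(N_NODES) for j in range(N_NODES) if adj[i, j]}

def generate_context():
    """One context of Eq. (8): X_j = sum_i w_ij^(c) f_ij(X_i) + sigma_j^(c) N_j^(c)."""
    X = np.zeros((N_SAMPLES, N_NODES))
    for j in range(N_NODES):                      # index order is topological for a triangular DAG
        # Noise: uniform or Gaussian with equal probability; the scale sigma_j^(c) = 1 is assumed.
        noise = (rng.uniform(-1, 1, N_SAMPLES) if rng.random() < 0.5
                 else rng.normal(0.0, 1.0, N_SAMPLES))
        X[:, j] = noise
        for i in np.flatnonzero(adj[:, j]):
            w = rng.uniform(0.5, 2.5)             # context-specific weight w_ij^(c) ~ U(0.5, 2.5)
            X[:, j] += w * f[(i, j)](X[:, i])
    return X

data = [generate_context() for _ in range(N_CONTEXTS)]   # one (500, 6) array per context
print([d.shape for d in data])
```

Generating the variables in index order works here only because the upper-triangular adjacency is already a topological order; a general DAG sampler would need an explicit topological sort.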
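
To make the algorithm headers in the Pseudocode row concrete, here is a hypothetical Python sketch of the stated interfaces only. The names linc, linc_clus, score, and Partition are illustrative; the paper's scoring criterion and its search over candidate partitions are not part of the excerpt, so both appear as placeholders, and only the inputs and outputs mirror Algorithms 1 and 2 as quoted.

```python
# Hypothetical interface sketch of Algorithms 1 and 2 as quoted above.
# Only the inputs and outputs follow the excerpt; the candidate partitions and
# the scoring function are placeholders, not the paper's actual procedure.
from typing import Callable, Dict, List, Sequence, Set, Tuple

import numpy as np

Partition = List[Set[int]]   # groups of context indices assumed to share a mechanism
Score = Callable[[Sequence[np.ndarray], int, Sequence[int], Partition], float]

def linc_clus(X: Sequence[np.ndarray], i: int, S: Sequence[int], score: Score) -> Partition:
    """Algorithm 2 (LINCCLUS) interface: partition the contexts for X_i given causes X_S.

    Placeholder search: compare "one shared mechanism" against "one mechanism per
    context" and keep whichever scores lower.
    """
    contexts = list(range(len(X)))
    candidates = [[set(contexts)], [{c} for c in contexts]]
    return min(candidates, key=lambda pi: score(X, i, S, pi))

def linc(X: Sequence[np.ndarray],
         candidate_dags: Sequence[Dict[int, List[int]]],
         score: Score) -> Tuple[Dict[int, List[int]], Dict[int, Partition]]:
    """Algorithm 1 (LINC) interface: return the best-scoring DAG and its partitions."""
    best_dag, best_parts, best_total = None, None, float("inf")
    for dag in candidate_dags:                    # dag maps each variable to its parent list
        parts = {i: linc_clus(X, i, pa, score) for i, pa in dag.items()}
        total = sum(score(X, i, pa, parts[i]) for i, pa in dag.items())
        if total < best_total:
            best_dag, best_parts, best_total = dag, parts, total
    return best_dag, best_parts
```

Any scoring function matching the score signature could be plugged in; the sketch only fixes the data flow implied by the quoted inputs and outputs.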