Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Learning Causal Models under Independent Changes
Authors: Sarah Mameche, David Kaltenpoth, Jilles Vreeken
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we show that our method performs well in a range of synthetic settings, on realistic gene expression simulations, as well as on real-world cell signaling data. |
| Researcher Affiliation | Academia | Sarah Mameche, CISPA Helmholtz Center for Information Security; David Kaltenpoth, CISPA Helmholtz Center for Information Security; Jilles Vreeken, CISPA Helmholtz Center for Information Security |
| Pseudocode | Yes | Algorithm 1: LINC. Input: data $X^{(c)}$, candidate DAGs $G$. Output: DAG $G$ and partitions $\Pi$. Algorithm 2: LINCCLUS. Input: data $X^{(c)}$, variable $X_i$ with causes $X_S$. Output: partition $\Pi_i$. |
| Open Source Code | Yes | We make the code and datasets available in the supplement. |
| Open Datasets | Yes | We make the code and datasets available in the supplement. ... we simulate data with SERGIO [19] to generate single-cell expression data... Finally, we evaluate LINC on real-world data over eleven proteins and phospholipid components... which Sachs et al. [20] added to the system in multiple experiments. |
| Dataset Splits | No | The paper describes generating synthetic data and simulations with specific parameters and sample sizes per context (e.g., "$\lvert c \rvert = 500$ samples"), but it does not specify explicit train/validation/test dataset splits (e.g., percentages or exact counts for each partition) or reference standard predefined splits. |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware details such as GPU models, CPU types, or memory specifications used for running experiments. |
| Software Dependencies | No | The paper mentions using specific software tools like "KCI test" but does not provide version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We sample DAGs $G$ of size $\lvert G \rvert = 6$ with edge density $p = 0.3$ and generate data in $\lvert C \rvert = 5$ contexts, with $\lvert c \rvert = 500$ samples and $\lvert S \rvert = 2$. ... For the synthetic data, following the literature [4, 13] we generate data in multiple contexts using $\omega^{(c)}_{ij} f_{ij}(X^{(c)}_i) + \sigma^{(c)}_j N^{(c)}_j$ (8), with weights $\omega^{(c)}_{ij} \sim U(0.5, 2.5)$, where noise is either uniform or Gaussian with equal probability. We sample the causal functions $f$ from $\{x^2, x^3, \tanh, \mathrm{sinc}\}$. |
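
The synthetic setup quoted in the Experiment Setup row can be illustrated with a small NumPy sketch. This is not the authors' code, and several details are assumptions made only for illustration: the names `sample_dag` and `simulate_contexts` are hypothetical, each variable is taken to be the weighted sum of its parents' transformed values plus noise, the noise scale is fixed at 1, uniform noise is drawn from U(-1, 1), `sinc` is NumPy's normalized sinc, and weights are redrawn for every edge in every context. The role of |S| = 2 is not spelled out in the quoted text, so the sketch does not model it.

```python
import numpy as np

# Candidate causal functions named in the setup: x^2, x^3, tanh, sinc.
# np.sinc is the normalized sinc; which variant the paper uses is not stated.
FUNCS = [lambda x: x**2, lambda x: x**3, np.tanh, np.sinc]


def sample_dag(n_nodes, p, rng):
    """Random DAG as a boolean adjacency matrix.

    Edges i -> j are only allowed for i < j (strict upper triangle),
    which guarantees acyclicity.
    """
    return np.triu(rng.random((n_nodes, n_nodes)) < p, k=1)


def simulate_contexts(n_nodes=6, p=0.3, n_contexts=5, n_samples=500, seed=0):
    """Generate one data matrix per context from a shared random DAG."""
    rng = np.random.default_rng(seed)
    adj = sample_dag(n_nodes, p, rng)
    # Assumption: one causal function per edge, shared across contexts.
    f_idx = rng.integers(len(FUNCS), size=(n_nodes, n_nodes))
    # Noise type per variable: Gaussian or uniform with equal probability.
    gaussian = rng.random(n_nodes) < 0.5
    data = []
    for _ in range(n_contexts):
        # Assumption: all edge weights are redrawn from U(0.5, 2.5) per context.
        w = rng.uniform(0.5, 2.5, size=(n_nodes, n_nodes))
        X = np.zeros((n_samples, n_nodes))
        for j in range(n_nodes):  # index order is a topological order
            if gaussian[j]:
                noise = rng.standard_normal(n_samples)
            else:
                noise = rng.uniform(-1.0, 1.0, n_samples)
            X[:, j] = noise  # noise scale fixed at 1 (assumption)
            for i in np.flatnonzero(adj[:, j]):
                X[:, j] += w[i, j] * FUNCS[f_idx[i, j]](X[:, i])
        data.append(X)
    return adj, data


adj, data = simulate_contexts()
print(adj.astype(int))
print(len(data), data[0].shape)
```

The sketch applies no standardization, so long chains of polynomial edges can produce very large values; a real pipeline would likely rescale each variable before running a discovery method.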