Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Causal discovery from observational and interventional data across multiple environments

Authors: Adam Li, Amin Jaber, Elias Bareinboim

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | In this paper, we develop a fundamental approach to structure learning in non-Markovian systems (i.e. when there exist latent confounders) leveraging observational and interventional data collected from multiple domains. Specifically, we start by showing that learning from observational data in multiple domains is equivalent to learning from interventional data with unknown targets in a single domain. ... Leveraging the S-Markov property, we introduce a new constraint-based causal discovery algorithm, S-FCI, that can learn from observational and interventional data from different domains. We prove that the algorithm is sound and subsumes existing constraint-based causal discovery algorithms. |
| Researcher Affiliation | Collaboration | Adam Li Department of Computer Science, Columbia University... Synlico Inc. EMAIL... Elias Bareinboim Department of Computer Science, Columbia University |
| Pseudocode | Yes | Algorithm 1 S-FCI: Algorithm for Learning a S-PAG |
| Open Source Code | Yes | Our algorithm is implemented in open-source MIT-Licensed https://github.com/py-why/dodiscover. |
| Open Datasets | No | The paper introduces a theoretical framework and algorithm with conceptual examples but does not report empirical experiments using a specific dataset, nor does it provide concrete access information for any dataset used in an empirical training capacity. |
| Dataset Splits | No | The paper focuses on theoretical development and does not describe empirical experiments or specific dataset splits (training, validation, test) for reproducibility. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, memory, or cloud instances) used for computation or running examples. |
| Software Dependencies | No | The paper mentions its algorithm is implemented in 'dodiscover' but does not provide specific version numbers for software dependencies such as programming languages, libraries, or frameworks (e.g., Python version, PyTorch version). |
| Experiment Setup | No | The paper describes a theoretical framework and an algorithm but does not detail any specific experimental setup, hyperparameters, or system-level training settings as it does not report empirical experiments. |
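
The Research Type entry quotes the paper's claim that observational data from multiple domains can be treated like interventional data with unknown targets in a single domain. The sketch below is a toy illustration of that idea only, not the S-FCI algorithm and not the dodiscover API. It assumes a hypothetical two-variable linear-Gaussian model X -> Y in which only the mechanism of X differs between two domains, pools the samples with a domain indicator `s` (a name chosen here for illustration), and uses partial correlation as a stand-in conditional independence test. All function names, coefficients, and sample sizes are assumptions.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def residuals(target, predictor):
    """Residuals of a least-squares fit of target on predictor (with intercept)."""
    n = len(target)
    mean_t, mean_p = sum(target) / n, sum(predictor) / n
    sxy = sum((p - mean_p) * (t - mean_t) for p, t in zip(predictor, target))
    sxx = sum((p - mean_p) ** 2 for p in predictor)
    slope = sxy / sxx
    return [(t - mean_t) - slope * (p - mean_p) for t, p in zip(target, predictor)]


def partial_corr(a, b, given):
    """Partial correlation of a and b given one conditioning variable."""
    return pearson(residuals(a, given), residuals(b, given))


def simulate_domain(n, x_mean, rng):
    """Hypothetical domain: X -> Y; only the mean of X varies across domains."""
    xs = [rng.gauss(x_mean, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)
x0, y0 = simulate_domain(5000, 0.0, rng)  # domain 0
x1, y1 = simulate_domain(5000, 2.0, rng)  # domain 1: mechanism of X shifted

# Pool both domains and record which domain each sample came from.
x = x0 + x1
y = y0 + y1
s = [0.0] * len(x0) + [1.0] * len(x1)

corr_s_x = pearson(s, x)                   # X depends on the domain
corr_s_y = pearson(s, y)                   # Y also differs marginally across domains
pcorr_s_y_given_x = partial_corr(s, y, x)  # but not once X is conditioned on

print(f"corr(S, X)      = {corr_s_x:.3f}")
print(f"corr(S, Y)      = {corr_s_y:.3f}")
print(f"pcorr(S, Y | X) = {pcorr_s_y_given_x:.3f}")
```

In this toy setup the domain indicator behaves like an intervention variable whose target is not given in advance: the dependence between `s` and `x` that no conditioning removes, together with the near-zero partial correlation between `s` and `y` given `x`, points to X as the variable whose mechanism changed. Handling latent confounders, which the paper addresses, is outside the scope of this sketch.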