Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On Learning Necessary and Sufficient Causal Graphs

Authors: Hengrui Cai, Yixin Wang, Michael Jordan, Rui Song

NeurIPS 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Across empirical studies of simulated and real data, we demonstrate that NSCSL outperforms existing algorithms and can reveal crucial yeast genes for target heritable traits of interest. Empirical studies in simulated datasets show that NSCSL outperforms existing algorithms in distilling relevant subgraphs for outcomes of interest; NSCSL can also identify important quantitative trait loci for the yeast and the causal protein signaling network for single cell data, as demonstrated in real data analyses. (Section 6, Experiments)
Researcher Affiliation | Academia | Hengrui Cai, University of California, Irvine; Yixin Wang, University of Michigan; Michael I. Jordan, University of California, Berkeley; Rui Song, North Carolina State University
Pseudocode | No | The paper describes the steps of the NSCSL algorithm in Section 5.2 but does not provide a formal pseudocode block or algorithm box.
Open Source Code | No | The paper does not contain any statement about releasing source code or provide any links to a code repository.
Open Datasets | Yes | We conduct real data analysis using the benchmark data from Sachs et al. [30]. Furthermore, we apply NSCSL to gene expression traits in yeast [2] using a dataset of 104 yeast segregants with diverse genotypes.
Dataset Splits | No | The paper mentions evaluating methods over 50 replications and with different sample sizes (n1, n2), but it does not provide specific training/validation/test dataset splits (e.g., percentages or exact counts) for reproducibility.
Hardware Specification | No | The experiments are conducted on a Google Cloud Platform virtual machine with 8 processor cores and 32 GB memory. This describes the general type of machine and core count but lacks specific CPU models (e.g., Intel Xeon E5) or GPU details, which are necessary for precise hardware specification.
Software Dependencies | No | The paper mentions using NOTEARS and the Adam optimizer but does not specify any software names with version numbers for libraries, frameworks, or environments (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | Here, we use a graph threshold of 0.3 (commonly used in the literature [46; 44; 48; 5]) to prune the noise edges for a fair comparison. The training details are provided in Table E.1.
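
The Experiment Setup entry mentions pruning noise edges with a graph threshold of 0.3. As a rough illustration of what such a step can look like, the sketch below zeroes out entries of an estimated weighted adjacency matrix whose absolute value falls below the threshold. This is not the authors' code: the function name `prune_edges`, the matrix `w_est`, and its values are hypothetical, and the choice to keep entries exactly equal to the threshold is an assumption.

```python
def prune_edges(weights, threshold=0.3):
    """Return a copy of `weights` with small-magnitude entries set to 0.0.

    Assumption: entries with |w| < threshold are treated as noise edges;
    entries with |w| >= threshold are kept unchanged.
    """
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


# Hypothetical 3-node estimated weight matrix (illustrative values only).
# Entry [i][j] is the estimated weight of the edge i -> j.
w_est = [
    [0.0, 0.85, 0.10],
    [0.0, 0.0, -0.45],
    [0.05, 0.0, 0.0],
]

w_pruned = prune_edges(w_est, threshold=0.3)

# Surviving directed edges as (source, target) index pairs.
edges = [
    (i, j)
    for i, row in enumerate(w_pruned)
    for j, w in enumerate(row)
    if w != 0.0
]
```

In this toy matrix only the edges 0 -> 1 and 1 -> 2 survive; the weights 0.10 and 0.05 fall below the threshold and are dropped.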