Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On Learning Necessary and Sufficient Causal Graphs
Authors: Hengrui Cai, Yixin Wang, Michael Jordan, Rui Song
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Across empirical studies of simulated and real data, we demonstrate that NSCSL outperforms existing algorithms and can reveal crucial yeast genes for target heritable traits of interest. Empirical studies in simulated datasets show that NSCSL outperforms existing algorithms in distilling relevant subgraphs for outcomes of interest; NSCSL can also identify important quantitative trait loci for the yeast and the causal protein signaling network for single cell data, as demonstrated in real data analyses. 6 Experiments |
| Researcher Affiliation | Academia | Hengrui Cai, University of California, Irvine; Yixin Wang, University of Michigan; Michael I. Jordan, University of California, Berkeley; Rui Song, North Carolina State University |
| Pseudocode | No | The paper describes the steps of the NSCSL algorithm in Section 5.2 but does not provide a formal pseudocode block or algorithm box. |
| Open Source Code | No | The paper does not contain any statement about releasing source code or provide any links to a code repository. |
| Open Datasets | Yes | We conduct real data analysis using the benchmark data from Sachs et al. [30]. Furthermore, we apply NSCSL to gene expression traits in yeast [2] using a dataset of 104 yeast segregants with diverse genotypes. |
| Dataset Splits | No | The paper mentions evaluating methods over 50 replications and with different sample sizes (n1, n2), but it does not provide specific training/validation/test dataset splits (e.g., percentages or exact counts) for reproducibility. |
| Hardware Specification | No | The experiments are conducted on a Google Cloud Platform virtual machine with 8 processor cores and 32GB memory. This describes the general type of machine and core count but lacks specific CPU models (e.g., Intel Xeon E5) or GPU details, which are necessary for precise hardware specification. |
| Software Dependencies | No | The paper mentions using NOTEARS and Adam optimizer but does not specify any software names with version numbers for libraries, frameworks, or environments (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | Here, we use a graph threshold of 0.3 (commonly used in the literature [46; 44; 48; 5]) to prune the noise edges for a fair comparison. The training details are provided in Table E.1. |
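
The graph threshold quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes the estimated graph is stored as a weighted adjacency matrix and that pruning means zeroing every entry whose absolute weight falls below 0.3. The function name `prune_edges` and the example matrix are hypothetical and are not taken from the paper.

```python
import numpy as np


def prune_edges(weights, threshold=0.3):
    """Zero out edges whose absolute weight is below the threshold.

    Assumption: edges with |weight| >= threshold are kept; the paper
    only states that a graph threshold of 0.3 is used.
    """
    weights = np.asarray(weights, dtype=float)
    return np.where(np.abs(weights) >= threshold, weights, 0.0)


# Hypothetical estimated weight matrix for a 3-node graph
# (rows are parents, columns are children).
W_est = np.array([
    [0.0, 0.8, 0.1],
    [0.0, 0.0, -0.5],
    [0.2, 0.0, 0.0],
])

W_pruned = prune_edges(W_est)
num_edges = int(np.count_nonzero(W_pruned))
```

In this illustrative matrix the two weak entries (0.1 and 0.2) are removed, while the edges with weights 0.8 and -0.5 remain.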