Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Bayesian Causal Structural Learning with Zero-Inflated Poisson Bayesian Networks
Authors: Junsouk Choi, Robert Chapkin, Yang Ni
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the utility of the proposed ZIPBN in causal discoveries for zero-inflated count data by simulation studies with comparison to alternative Bayesian network methods. Additionally, real single-cell RNA-sequencing data with known causal relationships will be used to assess the capability of ZIPBN for discovering causal relationships in real-world problems. |
| Researcher Affiliation | Academia | Junsouk Choi Department of Statistics Texas A&M University College Station, TX 77843 EMAIL Robert Chapkin Department of Nutrition Texas A&M University College Station, TX 77843 EMAIL Yang Ni Department of Statistics Texas A&M University College Station, TX 77843 EMAIL |
| Pseudocode | Yes | Algorithm 1 Parallel-Tempered MCMC for ZIPBN |
| Open Source Code | No | The paper states 'The code implementing the MCMC is available in the Supplementary Material.' This does not explicitly state it is open-source or provide a direct public link, which is required by the prompt's criteria. |
| Open Datasets | Yes | Additionally, real single-cell RNA-sequencing data with known causal relationships will be used to assess the capability of ZIPBN for discovering causal relationships in real-world problems. |
| Dataset Splits | No | The paper mentions 'simulated data under different samples sizes n ∈ {250, 500, 1000}' and 'retained 479 pairs for causal validation', but does not provide explicit train/validation/test splits with percentages or sample counts for reproducibility. |
| Hardware Specification | Yes | The CPU time was 1.7 hours on an i9-9880H 2.3GHz CPU. |
| Software Dependencies | No | The paper mentions 'R package Seurat (Butler et al., 2018)' but does not provide specific version numbers for R, Seurat, or any other software components. |
| Experiment Setup | Yes | For the proposed ZIPBN, we used non-informative prior by setting the hyperparameters to be (aτ, bτ) = (0.01, 0.01) and (aρ, bρ) = (0.5, 0.5)... We ran M = 10 parallel chains for 3,000 iterations, of which the first 1,500 iterations were discarded as burn-in. The temperatures were chosen uniformly between 0 and 1 on the log-scale, i.e., log(Tm) = (m − 1)/9 for m = 1, ..., 10. The swapping probability ps was chosen to be 10%. |
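
The temperature schedule quoted in the Experiment Setup row can be written out numerically. The following is a minimal sketch, not the authors' code: the function name `temperature_ladder` and the constant names are hypothetical, and dividing by `num_chains - 1` is an assumed generalization of the quoted `/9` for M = 10.

```python
import math

# Values quoted in the Experiment Setup row.
NUM_CHAINS = 10     # M parallel chains
NUM_ITER = 3000     # iterations per chain
BURN_IN = 1500      # initial iterations discarded
SWAP_PROB = 0.10    # swapping probability p_s


def temperature_ladder(num_chains=NUM_CHAINS):
    """Return [T_1, ..., T_M] with log(T_m) = (m - 1) / (num_chains - 1).

    For num_chains = 10 this matches the quoted log(T_m) = (m - 1) / 9.
    The general divisor is an assumption made for illustration.
    """
    return [math.exp((m - 1) / (num_chains - 1))
            for m in range(1, num_chains + 1)]


temps = temperature_ladder()
retained = NUM_ITER - BURN_IN  # post-burn-in iterations kept per chain

for m, t in enumerate(temps, start=1):
    print(f"chain {m:2d}: log(T) = {math.log(t):.3f}, T = {t:.3f}")
print(f"retained iterations per chain: {retained}")
```

Under this reading, the first chain runs at T = 1 (the untempered target) and the last at T = e, with log-temperatures evenly spaced between 0 and 1.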