Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Bayesian Causal Structural Learning with Zero-Inflated Poisson Bayesian Networks

Authors: Junsouk Choi, Robert Chapkin, Yang Ni

NeurIPS 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the utility of the proposed ZIPBN in causal discoveries for zero-inflated count data by simulation studies with comparison to alternative Bayesian network methods. Additionally, real single-cell RNA-sequencing data with known causal relationships will be used to assess the capability of ZIPBN for discovering causal relationships in real-world problems.
Researcher Affiliation	Academia	Junsouk Choi Department of Statistics Texas A&M University College Station, TX 77843 EMAIL Robert Chapkin Department of Nutrition Texas A&M University College Station, TX 77843 EMAIL Yang Ni Department of Statistics Texas A&M University College Station, TX 77843 EMAIL
Pseudocode	Yes	Algorithm 1 Parallel-Tempered MCMC for ZIPBN
Open Source Code	No	The paper states 'The code implementing the MCMC is available in the Supplementary Material.' This does not explicitly state it is open-source or provide a direct public link, which is required by the prompt's criteria.
Open Datasets	Yes	Additionally, real single-cell RNA-sequencing data with known causal relationships will be used to assess the capability of ZIPBN for discovering causal relationships in real-world problems.
Dataset Splits	No	The paper mentions 'simulated data under different samples sizes n ∈ {250, 500, 1000}' and 'retained 479 pairs for causal validation', but does not provide explicit train/validation/test splits with percentages or sample counts for reproducibility.
Hardware Specification	Yes	The CPU time was 1.7 hours on an i9-9880H 2.3GHz CPU.
Software Dependencies	No	The paper mentions 'R package Seurat (Butler et al., 2018)' but does not provide specific version numbers for R, Seurat, or any other software components.
Experiment Setup	Yes	For the proposed ZIPBN, we used non-informative prior by setting the hyperparameters to be (a_τ, b_τ) = (0.01, 0.01) and (a_ρ, b_ρ) = (0.5, 0.5)... We ran M = 10 parallel chains for 3,000 iterations, of which the first 1,500 iterations were discarded as burn-in. The temperatures were chosen uniformly between 0 and 1 on the log-scale, i.e., log(T_m) = (m − 1)/9 for m = 1, ..., 10. The swapping probability p_s was chosen to be 10%.
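
The temperature ladder quoted in the Experiment Setup row can be sketched in a few lines. This is only an illustration of the stated formula log(T_m) = (m − 1)/9, not the authors' code. The function name temperature_ladder is hypothetical, and the use of the natural logarithm is an assumption, since the quoted text says only "log-scale".

```python
import math


def temperature_ladder(num_chains: int = 10) -> list[float]:
    """Return temperatures T_1..T_M with log(T_m) evenly spaced on [0, 1].

    Assumption: "log" means the natural logarithm; the source does not say.
    """
    if num_chains < 2:
        raise ValueError("need at least two chains to space temperatures")
    # log(T_m) = (m - 1) / (M - 1); with M = 10 this is (m - 1) / 9.
    return [math.exp((m - 1) / (num_chains - 1)) for m in range(1, num_chains + 1)]


# Settings quoted in the table: M = 10 chains, 3,000 iterations, 1,500 burn-in.
temps = temperature_ladder(10)
total_iterations = 3000
burn_in = 1500
kept_iterations = total_iterations - burn_in  # draws retained per chain

for m, t in enumerate(temps, start=1):
    print(f"chain {m:2d}: T = {t:.4f}")
```

Under this reading, the first chain runs at T = 1 (the untempered target) and the hottest chain at T = e, with 1,500 iterations retained after burn-in.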