Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Graphical Models via Univariate Exponential Family Distributions

Authors: Eunho Yang, Pradeep Ravikumar, Genevera I. Allen, Zhandong Liu

JMLR 2015 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our M-estimators for exponential family graphical models, specifically for the Poisson and exponential distributions, through simulations and real data examples.
Researcher Affiliation	Collaboration	Eunho Yang (EMAIL), IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA; Pradeep Ravikumar (EMAIL), Department of Computer Science, University of Texas, Austin, Austin, TX 78712, USA; Genevera I. Allen (EMAIL), Department of Statistics, Rice University, Houston, TX 77005, USA; Zhandong Liu (EMAIL), Department of Pediatrics-Neurology, Baylor College of Medicine, Houston, TX 77030, USA
Pseudocode	No	The paper describes optimization problems mathematically in Appendix E, but does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper states that "Optimization algorithms were implemented using projected gradient descent" in Section 4, but there is no explicit statement about releasing source code for their implementation, nor any links to a code repository.
Open Datasets	Yes	Level III breast cancer miRNA expression (Cancer Genome Atlas Research Network, 2012) as measured by next generation sequencing was downloaded from the TCGA portal (http://tcga-data.nci.nih.gov/tcga/). We demonstrate our exponential graphical model, derived from the univariate exponential distribution, using a protein signaling example (Sachs et al., 2005).
Dataset Splits	No	For the simulation studies, the paper mentions using varying numbers of nodes p, generating i.i.d. samples, and performing 50 replicates. For real data, it specifies the total number of subjects/cells, e.g., "544 subjects and 262 miRNAs" and "n = 7462 cells". However, it does not provide specific training, validation, or test dataset splits, nor does it refer to standard predefined splits for the real datasets.
Hardware Specification	No	The paper does not provide specific hardware details such as CPU models, GPU models, or memory specifications used to run the experiments.
Software Dependencies	No	The paper mentions that "Optimization algorithms were implemented using projected gradient descent" and that "Gibbs sampling" was used for generating samples. However, it does not specify any software libraries or their version numbers used for implementation.
Experiment Setup	Yes	We instantiated the corresponding exponential and Poisson graphical model distributions in (15) and (14) for 4 nearest neighbor lattice graphs (d = 4), with varying number of nodes, p ∈ {64, 100, 169, 225}, and with identical edge weights for all edges: for exponential MRF, θ_r = 0.1 and θ_rt = 1, and, for Poisson MRF, θ_r = 2 and θ_rt = 0.1. We generated i.i.d. samples from these distributions using Gibbs sampling, and solved our sparsity-constrained M-estimation problem by setting λ_n = c/√n, following our corollaries; c = 3 for exponential MRF, and 15 for Poisson MRF. A Poisson graphical model was fit to the meta-miRNA data by performing neighborhood selection with the sparsity of the graph determined by stability selection (Liu et al., 2010).
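
Since no source code was released, the simulation setup quoted in the Experiment Setup row can only be approximated. The following is a minimal Python sketch of two pieces of it: building a 4-nearest-neighbor lattice on p nodes and computing the regularization level λ_n = c/√n. The function names, the non-periodic boundary handling, and the sample size n = 200 are assumptions made for illustration; they are not taken from the paper, and this is not the authors' implementation.

```python
import math


def lattice_edges(side):
    """Undirected edges of a side x side grid (hypothetical helper).

    Each node is linked to its right and lower neighbor, so interior nodes
    end up with 4 neighbors. Boundaries are NOT wrapped (an assumption).
    """
    edges = set()
    for i in range(side):
        for j in range(side):
            node = i * side + j
            if j + 1 < side:
                edges.add((node, node + 1))      # right neighbor
            if i + 1 < side:
                edges.add((node, node + side))   # lower neighbor
    return edges


def max_degree(edges, p):
    """Largest number of neighbors of any node in a graph on p nodes."""
    degree = [0] * p
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return max(degree)


def regularization(c, n):
    """lambda_n = c / sqrt(n), the form quoted in the Experiment Setup row."""
    return c / math.sqrt(n)


if __name__ == "__main__":
    n = 200  # hypothetical sample size; the quoted text does not state n
    for p in (64, 100, 169, 225):
        side = math.isqrt(p)
        edges = lattice_edges(side)
        print(p, len(edges), max_degree(edges, p))
    # c = 3 (exponential MRF) and c = 15 (Poisson MRF), as quoted above
    print(regularization(3, n), regularization(15, n))
```

All four values of p are perfect squares (8², 10², 13², 15²), which is why a square grid is a natural reading of "lattice" here. Sampling and estimation are left out on purpose, because the quoted text does not give enough detail to reproduce them faithfully.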