Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Scalable high-dimensional Bayesian varying coefficient models with unknown within-subject covariance
Authors: Ray Bai, Mary R. Boland, Yong Chen
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the scalability, variable selection performance, and inferential capabilities of our method through simulations and a real data application. |
| Researcher Affiliation | Academia | Ray Bai EMAIL Department of Statistics University of South Carolina Columbia, SC 29201, USA Mary R. Boland EMAIL Department of Mathematics Saint Vincent College Latrobe, PA 15650, USA Yong Chen EMAIL Department of Biostatistics, Epidemiology, and Informatics University of Pennsylvania Philadelphia, PA 19104, USA |
| Pseudocode | Yes | Algorithm 1 ECM algorithm for MAP estimation under NVC-SSL Algorithm 2 MCMC algorithm for NVC-SSL Algorithm 3 Exact algorithm for sampling γ in Step 6 of Algorithm 2 when dp > n Algorithm 4 Approximate algorithm for sampling γ in Step 6 of Algorithm 2 when dp > n |
| Open Source Code | Yes | These algorithms are implemented in the publicly available R package NVCSSL on the Comprehensive R Archive Network. All of the methods in this section were implemented in the publicly available R package NVCSSL, which can be found on the Comprehensive R Archive Network. |
| Open Datasets | Yes | The data that we used comes from the α-factor synchronized cultures of Spellman et al. (1998) and the CHIP-chip data of Lee et al. (2002). |
| Dataset Splits | Yes | To assess their variable selection performance, we fit these models using all n = 47 genes. We also examined the models' predictive power. To do so, we randomly divided the dataset into 37 training observations and 10 test observations. We fit the NVC models to the training data and then used our fitted models to predict the trajectories of mRNA level ŷ(t) for the 10 test observations and compute the out-of-sample MSPE. We repeated this procedure 200 times, so that we had 200 different test sets on which to evaluate these different methods. |
| Hardware Specification | Yes | All of our experiments were performed on an Intel Xeon 8358 Platinum processor with 2.6GHz CPU and 128 GB memory. Running the exact algorithm for 2000 iterations also took 2.3 hours for the one replicate in Figure 3, whereas the approximate algorithm took only 6.2 minutes on an 11th Gen Intel Core i5-1135G7 processor. |
| Software Dependencies | No | The paper mentions the "publicly available R package NVCSSL" and the "R package sns" but does not specify version numbers for R or any other dependencies beyond the package names themselves. For example, it does not state "R version X.X" or "PyTorch 1.9". |
| Experiment Setup | Yes | We compared the MSE for the posterior mean functions β̃k(t)'s obtained from the exact MCMC and the approximate MCMC algorithms. We also compared the average width and the empirical coverage probability (ECP) of the 95% posterior credible intervals. We looked at both the pointwise ECP (i.e. the proportion of pointwise credible intervals that contained the true value of βk(tij) for each observed time point tij, 1 ≤ i ≤ n, 1 ≤ j ≤ mi) and the simultaneous ECP. Here, the simultaneous ECP was determined by the proportion of simulations where all of the posterior credible intervals covered all of the true varying coefficient functions in the entire time domain. In all replications, we ran both the exact and approximate MCMC algorithms introduced in Section 4 for 2000 iterations, discarding the first 500 iterations as burn-in. The remaining 1500 MCMC samples were used to approximate the posteriors and perform uncertainty quantification. Our MCMC algorithms were initialized with the MAP estimator obtained from the ECM algorithm, and all hyperparameters and basis dimensions were the same as those used for the ECM algorithm. |
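The repeated-split evaluation quoted under Dataset Splits (37 training / 10 test observations, 200 repetitions) can be sketched as follows. This is a minimal Python sketch of the general procedure, not the authors' R code; `predict_fn` is a hypothetical stand-in for fitting an NVC model on the training indices and predicting the test trajectories.

```python
import numpy as np

rng = np.random.default_rng(0)

def repeated_split_mspe(y, predict_fn, n_train=37, n_test=10, n_reps=200):
    """Estimate out-of-sample MSPE by repeated random train/test splits.

    Mirrors the paper's setup of splitting n = 47 genes into 37 training
    and 10 test observations, repeated 200 times. `predict_fn` is a
    hypothetical callable (train_idx, test_idx) -> predictions for the
    test observations; it is not part of the NVCSSL package.
    """
    n = len(y)
    mspes = []
    for _ in range(n_reps):
        perm = rng.permutation(n)
        train_idx = perm[:n_train]
        test_idx = perm[n_train:n_train + n_test]
        preds = predict_fn(train_idx, test_idx)
        # Mean squared prediction error on this test set
        mspes.append(np.mean((np.asarray(y)[test_idx] - preds) ** 2))
    # Average MSPE over all random splits
    return float(np.mean(mspes))
```

Averaging the MSPE over many random splits reduces the dependence of the comparison on any single lucky or unlucky partition of the 47 genes.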
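The pointwise and simultaneous ECP definitions quoted under Experiment Setup can be illustrated with a small sketch. The helper below and its array shapes are illustrative assumptions, not code from the paper: given post-burn-in MCMC draws of βk(t) at each observed time point, it forms equal-tailed 95% credible intervals and reports both coverage summaries for one replication.

```python
import numpy as np

def coverage_rates(samples, truth, level=0.95):
    """Pointwise and simultaneous empirical coverage for one replication.

    `samples`: (n_draws, n_points) posterior draws of beta_k(t_ij) after
    burn-in; `truth`: (n_points,) true function values at the same points.
    Returns the fraction of pointwise credible intervals containing the
    truth, and whether *all* intervals cover (the event whose frequency
    across simulations gives the simultaneous ECP).
    """
    alpha = 1.0 - level
    # Equal-tailed credible interval at each time point
    lo = np.quantile(samples, alpha / 2, axis=0)
    hi = np.quantile(samples, 1 - alpha / 2, axis=0)
    covered = (lo <= truth) & (truth <= hi)
    return float(covered.mean()), bool(covered.all())
```

Averaging the first return value over simulation replications gives the pointwise ECP; averaging the second (as a 0/1 indicator) gives the simultaneous ECP, which is necessarily no larger than the pointwise one.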