Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Covariate-moderated Empirical Bayes Matrix Factorization

Authors: William Denault, Karl Tayeb, Peter Carbonetto, Jason Willwerscheid, Matthew Stephens

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To assess the benefits of cEBMF, we compared cEBMF with other matrix factorization methods in simulated data sets. We compared with several methods that do not use side information, including EBMF (flashier R package [13, 35]), penalized matrix decomposition (PMD; PMA R package [39]), and a variational autoencoder (VAE) [26] implemented in PyTorch [46]. We also compared with other methods that use side information, including MFAI (mfair R package [17]), Spatial PCA [11], conditional VAE (cVAE) [27], and neural collaborative filtering (NCF) [28]. cVAE and NCF were also implemented in PyTorch.
Researcher Affiliation	Academia	Departments of Statistics and Human Genetics University of Chicago Chicago, IL 60637, USA EMAIL Jason Willwerscheid Mathematics and Computer Science Providence College Providence, RI 02918, USA EMAIL
Pseudocode	Yes	Algorithm 1: cEBMF algorithm. Require: n × p data matrix, Z; covariate or side information matrices, X (n × nx) and Y (p × ny); K, the number of factors; the prior families Gℓ,k and Gf,k, k = 1, ..., K; and initial estimates of the first and second moments of L (n × K), F (p × K), which are denoted by L, F, L2, F2. Compute the expected residuals, R = Z − L F^T. Algorithm 2: cEBMF single-factor update. Require: covariate or side information matrices, X (n × nx) and Y (p × ny); k ∈ {1, ..., K}, the dimension to update; the prior families Gℓ,k and Gf,k; an implementation of cEBNM(β̂, s, D, G) → (θ̂, q̂) (eq. 13) for prior families G = Gℓ,k and G = Gf,k; the expected residuals, Rk; estimates of the second moments, L2 (n × K), F2 (p × K); and the residual variances, τ.
Open Source Code	Yes	A PyTorch-based implementation of cEBMF with flexible priors is available at https://github.com/william-denault/cebmf_torch. Note: R and Python code implementing the experiments is available at https://github.com/william-denault/cEBMF_experiment, and a PyTorch-based implementation of cEBMF is available at https://github.com/william-denault/cebmf_torch.
Open Datasets	Yes	To provide a quantitative assessment of the matrix factorization methods in real data, we ran the same methods on the MovieLens 100K data [52], a standard collaborative filtering benchmark in which the goal is to predict the unobserved elements of the matrix. Although cEBMF was not specifically designed for spatial data, here we show that cEBMF also yields compelling results from spatial transcriptomics data [8] by exploiting the side information, the spatial locations of the data points. We illustrate this using a data set [53, 54] that has been annotated by domain experts and has been used in several papers to benchmark methods for spatial transcriptomics (e.g., [11, 55–57]).
Dataset Splits	Yes	We held out some of the movie ratings at random, and used these held-out ratings as a test set. Figure 3: Prediction performance of different matrix factorization and deep learning methods in the MovieLens 100K data [52]. Training proportion = X% means that X% of the movie ratings were used in training, and the remaining (100 − X)% were used to evaluate accuracy (measured using RMSE). The results at each training proportion are from 10 random training-test splits.
Hardware Specification	Yes	Note this benchmark was performed on a computer with 32 GB memory, an NVIDIA GeForce RTX™ 4070 GPU and an AMD Ryzen™ 9 7940HS CPU (8 cores, 16 threads).
Software Dependencies	No	A PyTorch-based [46] implementation of cEBMF with flexible priors is available at https://github.com/william-denault/cebmf_torch. In the simulations, cEBMF was implemented in R, in which learning the parameterized priors was performed using the Keras R interface [71] to TensorFlow [72].
Experiment Setup	Yes	All the models were trained for 50 epochs using the Adam optimizer with learning rate 0.001 and batch size 64. VAE had three hidden layers (of width 128, 64 and 30) in both the encoder and decoder (20 hidden dimensions). ReLU activations were used throughout. For the spatial transcriptomics data, we fit cEBMF and EBMF using gene-specific residual variances, σ²_ij = σ²_j. We used mixture-of-exponential priors for F, and the parameterized mixture-of-exponential priors (49) for L in which the mixture weights were learned using a multilayer perceptron instead of a multinomial regression. The multilayer perceptrons were defined as sequential models with a dense layer with 64 units and ReLU activations. We use two subsequent dense layers, each with 64 units, and ReLU activations using an L2 regularization coefficient of 0.001 to prevent overfitting. These regularized layers were followed by a dropout layer (with a dropout rate of 0.5). The subsequent layers were four dense layers each with 64 units, ReLU activations and L2 regularization coefficient of 0.001. The final layer was a dense layer with a softmax activation. These models were trained during each single-factor update using 300 epochs and a batch size of 1,500.
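
The evaluation protocol quoted in the Dataset Splits row (hold out ratings at random, train on X%, score the remaining (100 − X)% with RMSE, repeat over 10 random splits) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names (split_ratings, rmse, evaluate), the (user, item, rating) triple layout, the seeds, and the global-mean predictor standing in for a fitted factorization model are all assumptions made for the example.

```python
import math
import random


def split_ratings(ratings, train_fraction, seed):
    """Randomly split (user, item, rating) triples into train and test lists."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return math.sqrt(squared / len(targets))


def global_mean_baseline(train):
    """Hypothetical stand-in for a fitted model: always predicts the mean training rating."""
    mean = sum(r for _, _, r in train) / len(train)
    return lambda user, item: mean


def evaluate(ratings, train_fraction, n_splits=10, fit=global_mean_baseline):
    """Return one test RMSE per random training-test split."""
    scores = []
    for seed in range(n_splits):  # one seed per split (assumed; any seeding scheme works)
        train, test = split_ratings(ratings, train_fraction, seed)
        predict = fit(train)
        predictions = [predict(u, i) for u, i, _ in test]
        scores.append(rmse(predictions, [r for _, _, r in test]))
    return scores


if __name__ == "__main__":
    # Toy ratings on a 1-5 scale; a real run would load MovieLens 100K instead.
    toy = [(u, i, float((u + i) % 5 + 1)) for u in range(20) for i in range(10)]
    scores = evaluate(toy, train_fraction=0.8)
    print(f"mean RMSE over {len(scores)} splits: {sum(scores) / len(scores):.3f}")
```

Any method under comparison could be plugged in through the fit argument, provided it returns a callable that maps a (user, item) pair to a predicted rating; the split and scoring code stays the same across methods, which is what makes the per-proportion results comparable.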