Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Prior Specification for Exposure-based Bayesian Matrix Factorization
Authors: Zicong Zhu, Issei Sato
TMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this study, we present an enhanced method for specifying priors in Bayesian matrix factorization models. We improve the estimators by implementing an exposure-based model to better simulate data scarcity. Our method demonstrates significant accuracy improvements in hyperparameter estimation during synthetic experiments. We also explore the feasibility of applying this method to real-world datasets and provide insights into how the model's behavior adapts to varying levels of data sparsity. [...] We conducted experiments on synthetic datasets, demonstrating that our new estimators outperform existing methods, especially as the dataset becomes sparser. |
| Researcher Affiliation | Academia | Zicong Zhu, EMAIL, Department of Computer Science, The University of Tokyo; Issei Sato, EMAIL, Department of Computer Science, The University of Tokyo |
| Pseudocode | No | The paper describes the model definitions and derivations mathematically and textually, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We conducted additional experiments on real-world datasets MovieLens (Harper & Konstan, 2015), which has been widely studied for recommender systems. |
| Dataset Splits | No | We first generate the synthetic data with the following 3 steps repeatedly: (1) We sample the matrix P and Q with the prior hyperparameters for particular specifications; (2) We recover the fully dense matrix R by the product of P and Q; (3) We sample the Bernoulli variables Oij with different sparsity levels and multiply them with each entry of the dense matrix R to obtain the sparse observation matrix Y. [...] We selected three MovieLens datasets with different sizes, from 100k records to 10m records. The datasets contain users' ratings of different movies on a 5-star scale, with half-star increments (0.5 stars to 5.0 stars). While the paper describes the generation of synthetic data and the characteristics of the MovieLens datasets, it does not specify explicit training/test/validation splits for its experiments or how the MovieLens data was partitioned for the evaluation of the estimators. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | We conduct the experiments with specifications A, D, and F because they are distinct from each other. The full specification setup defined by da Silva et al. (2023) is described in Table 4. In specification A, matrices P and Q share the same prior parameters, but their shape parameters are 10 times larger than their rate parameters. [...] Table 1: Hyperparameters Initialization for Different Specifications. Columns: Spec., a, b, c, d, µp, σp, µq, σq, E[R], V[R]. Spec. A: 10, 1, 10, 1, 10.0, 3.16, 10.0, 3.16, 2500.00, 55000.00; Spec. D: 0.1, 1, 0.1, 1, 0.1, 0.32, 0.1, 0.32, 0.25, 0.55; Spec. F: 1, 1, 0.1, 0.1, 1.0, 1.0, 1.0, 3.16, 25.00, 550.00. [...] Table 2: Variables of Experiment Setups. Prior Spec.: [A, D, F]; K (Num. of Latent Factors): [25, 50, 75, 100, 125, 150]; Pobs. (Parameter of Bernoulli distribution): Group 1: [1.0, 0.98, 0.96, 0.94, 0.92, 0.90], Group 2: [0.5, 0.4, 0.3, 0.2, 0.1], Group 3: [0.05, 0.04, 0.03, 0.02, 0.01]. |
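
The three-step synthetic data procedure quoted under Dataset Splits can be sketched as follows. This is a minimal illustration, not the authors' code (the paper releases none). It assumes Gamma priors parameterized by shape and rate, inferred from the "shape" and "rate" wording in the Experiment Setup row. The function name, matrix dimensions, and seed are hypothetical.

```python
import numpy as np


def generate_sparse_observations(n_rows, n_cols, k, a, b, c, d, p_obs, seed=0):
    """Sketch of the quoted 3-step generation (Gamma priors are an assumption)."""
    rng = np.random.default_rng(seed)

    # Step 1: sample the factor matrices from their priors.
    # Assumption: (a, b) and (c, d) are Gamma shape/rate pairs for P and Q.
    # NumPy's gamma takes a scale parameter, so scale = 1 / rate.
    P = rng.gamma(shape=a, scale=1.0 / b, size=(n_rows, k))
    Q = rng.gamma(shape=c, scale=1.0 / d, size=(n_cols, k))

    # Step 2: recover the fully dense matrix R as the product of P and Q.
    R = P @ Q.T

    # Step 3: sample Bernoulli exposure indicators O_ij with parameter p_obs
    # and mask the dense matrix entrywise to get the sparse observations Y.
    O = rng.binomial(1, p_obs, size=R.shape)
    Y = O * R
    return Y, O, R


# Specification A (a=10, b=1, c=10, d=1) with K=25 and p_obs=0.5;
# the 200 x 150 matrix size is a hypothetical choice.
Y, O, R = generate_sparse_observations(
    200, 150, k=25, a=10, b=1, c=10, d=1, p_obs=0.5
)
```

Under these assumptions, each dense entry has expectation K · µp · µq, so `R.mean()` for specification A with K = 25 should land near the E[R] value of 2500 listed in Table 1. Lowering `p_obs` toward the Group 3 values makes `Y` correspondingly sparser.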