Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Latent Bandits Revisited
Authors: Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Craig Boutilier
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 'A comprehensive empirical study showcases the advantages of our approach.' and 'Finally, in Section 5, we demonstrate their effectiveness in synthetic simulations and on a large-scale real-world dataset.' |
| Researcher Affiliation | Industry | Joey Hong, Google Research; Branislav Kveton, Google Research; Manzil Zaheer, Google Research; Yinlam Chow, Google Research; Amr Ahmed, Google Research; Craig Boutilier, Google Research |
| Pseudocode | Yes | Algorithm 1: mUCB; Algorithm 2: mTS; Algorithm 3: mmTS |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described in this paper, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | 'We also assess the performance of our algorithms on the MovieLens 1M dataset [17]', with citation [17]: F. Maxwell Harper and Joseph A. Konstan. The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TiiS), 2015. |
| Dataset Splits | No | The paper states 'We randomly select 50% of all ratings as our training set and use the remaining 50% as the test set;' but does not explicitly mention a validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments, only general statements like 'synthetic simulations' and 'large-scale real-world dataset'. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | 'We evaluate each algorithm on 500 independent runs, with a uniformly sampled latent state in each run, and report the average reward over time.' 'The rewards are drawn i.i.d. from P(· \| a, s) = N(µ(a, s), σ²) with σ = 0.5.' 'We randomly select 50% of all ratings as our training set and use the remaining 50% as the test set; resulting in sparse rating matrices M_train and M_test. We complete each matrix using least-squares matrix completion [29] with rank 20. This rank is high enough to yield a low prediction error, and yet small enough to avoid overfitting.' 'Using k-means clustering on the rows of U, we cluster users into 5 clusters, where 5 is the largest value that does not yield empty clusters.' |
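
The synthetic evaluation protocol quoted in the Experiment Setup row (500 independent runs, a uniformly sampled latent state per run, Gaussian rewards with σ = 0.5, average reward over time) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the 2×2 mean-reward matrix, the horizon of 50 rounds, and the uniformly random placeholder policy are all assumptions introduced here. The paper's algorithms (mUCB, mTS, mmTS) would take the place of the placeholder policy.

```python
import numpy as np


def simulate_average_reward(mu, policy, n_runs=500, horizon=50, sigma=0.5, seed=0):
    """Return the per-round reward averaged over independent runs.

    mu:     array of shape (n_arms, n_states) with mean rewards mu(a, s).
    policy: hypothetical callable (rng, n_arms, history) -> arm index,
            standing in for any bandit algorithm.
    """
    rng = np.random.default_rng(seed)
    n_arms, n_states = mu.shape
    rewards = np.zeros((n_runs, horizon))
    for run in range(n_runs):
        # The latent state is sampled uniformly once and stays fixed for the run.
        state = rng.integers(n_states)
        history = []
        for t in range(horizon):
            arm = policy(rng, n_arms, history)
            # Reward drawn i.i.d. from N(mu(a, s), sigma^2).
            reward = rng.normal(mu[arm, state], sigma)
            history.append((arm, reward))
            rewards[run, t] = reward
    return rewards.mean(axis=0)


def uniform_policy(rng, n_arms, history):
    """Placeholder policy (an assumption): pick an arm uniformly at random."""
    return int(rng.integers(n_arms))


# Illustrative mean rewards (assumed): arm a is optimal exactly in state a.
mu = np.array([[1.0, 0.0],
               [0.0, 1.0]])
avg_reward = simulate_average_reward(mu, uniform_policy)
```

The returned array has one entry per round, so it can be plotted directly as an average-reward-over-time curve for each algorithm under comparison.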