Latent Bandits Revisited
Authors: Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Craig Boutilier
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 'A comprehensive empirical study showcases the advantages of our approach.' and 'Finally, in Section 5, we demonstrate their effectiveness in synthetic simulations and on a large-scale real-world dataset.' |
| Researcher Affiliation | Industry | Joey Hong (Google Research, jxihong@google.com); Branislav Kveton (Google Research, bkveton@google.com); Manzil Zaheer (Google Research, manzilzaheer@google.com); Yinlam Chow (Google Research, yinlamchow@google.com); Amr Ahmed (Google Research, amra@google.com); Craig Boutilier (Google Research, cboutilier@google.com) |
| Pseudocode | Yes | Algorithm 1 (mUCB), Algorithm 2 (mTS), Algorithm 3 (mmTS) |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described in this paper, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | 'We also assess the performance of our algorithms on the MovieLens 1M dataset [17]'; citation [17]: F. Maxwell Harper and Joseph A. Konstan. The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TiiS), 2015. |
| Dataset Splits | No | The paper states 'We randomly select 50% of all ratings as our training set and use the remaining 50% as the test set;' but does not explicitly mention a validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments, only general statements like 'synthetic simulations' and 'large-scale real-world dataset'. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | 'We evaluate each algorithm on 500 independent runs, with a uniformly sampled latent state in each run, and report the average reward over time.' The rewards are drawn i.i.d. from P(· \| a, s) = N(µ(a, s), σ²) with σ = 0.5. 'We randomly select 50% of all ratings as our training set and use the remaining 50% as the test set; resulting in sparse rating matrices M_train and M_test. We complete each matrix using least-squares matrix completion [29] with rank 20. This rank is high enough to yield a low prediction error, and yet small enough to avoid overfitting.' and 'Using k-means clustering on the rows of U, we cluster users into 5 clusters, where 5 is the largest value that does not yield empty clusters.' |
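
The synthetic setup quoted above is simple enough to reproduce directly. Below is a minimal sketch (assuming numpy); the numbers of arms and latent states and the horizon are illustrative assumptions not stated in the quoted text, and the uniform-random policy is a placeholder for the paper's mUCB/mTS/mmTS algorithms:

```python
import numpy as np

rng = np.random.default_rng(0)

K, S = 10, 5                  # hypothetical numbers of arms and latent states
sigma = 0.5                   # reward noise, as reported in the paper
n_runs, horizon = 500, 1000   # 500 independent runs per the paper; horizon is assumed

# Hypothetical mean-reward model mu(a, s); the paper assumes this is known offline.
mu = rng.uniform(0.0, 1.0, size=(K, S))

def pull(a: int, s: int) -> float:
    """Draw a reward r ~ N(mu(a, s), sigma^2) for arm a under latent state s."""
    return rng.normal(mu[a, s], sigma)

avg_reward = np.zeros(horizon)
for _ in range(n_runs):
    s = rng.integers(S)       # latent state sampled uniformly in each run
    for t in range(horizon):
        a = rng.integers(K)   # placeholder policy; substitute mUCB/mTS/mmTS here
        avg_reward[t] += pull(a, s)
avg_reward /= n_runs          # report the average reward over time
```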
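The MovieLens preprocessing can likewise be sketched under stated assumptions. The alternating ridge solves below stand in for the least-squares matrix completion of [29], which may differ in detail; the matrix sizes, sparsity, regularization, and iteration count are hypothetical, and only the 50/50 split, rank 20, and 5 k-means clusters come from the paper (numpy and scikit-learn assumed):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for the MovieLens 1M rating matrix (users x movies); real loading omitted.
n_users, n_movies = 200, 300
ratings = rng.uniform(1, 5, size=(n_users, n_movies))
observed = rng.random((n_users, n_movies)) < 0.2    # sparse observations

# Randomly split observed ratings 50/50 into train and test, as in the paper.
train_mask = observed & (rng.random((n_users, n_movies)) < 0.5)
test_mask = observed & ~train_mask

def als_complete(M, mask, rank=20, n_iters=15, reg=0.1):
    """Rank-constrained least-squares completion via alternating ridge solves."""
    U = rng.normal(scale=0.1, size=(M.shape[0], rank))
    V = rng.normal(scale=0.1, size=(M.shape[1], rank))
    eye = reg * np.eye(rank)
    for _ in range(n_iters):
        for i in range(M.shape[0]):   # solve a ridge problem per user row
            o = mask[i]
            U[i] = np.linalg.solve(V[o].T @ V[o] + eye, V[o].T @ M[i, o])
        for j in range(M.shape[1]):   # and per movie column
            o = mask[:, j]
            V[j] = np.linalg.solve(U[o].T @ U[o] + eye, U[o].T @ M[o, j])
    return U, V

U, V = als_complete(ratings * train_mask, train_mask, rank=20)

# Check completion quality on the held-out half of the ratings.
pred = U @ V.T
rmse = np.sqrt(np.mean((pred[test_mask] - ratings[test_mask]) ** 2))
print(f"test RMSE: {rmse:.3f}")

# Cluster users into 5 latent states via k-means on the rows of U.
states = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(U)
```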