Efficient Thompson Sampling for Online Matrix-Factorization Recommendation

Authors: Jaya Kawale, Hung H. Bui, Branislav Kveton, Long Tran-Thanh, Sanjay Chawla

NeurIPS 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments in collaborative filtering using several real-world datasets demonstrate that PTS significantly outperforms the current state-of-the-arts. |
| Researcher Affiliation | Collaboration | Jaya Kawale, Hung Bui, Branislav Kveton (Adobe Research, San Jose, CA; {kawale, hubui, kveton}@adobe.com); Long Tran-Thanh (University of Southampton, Southampton, UK; ltt08r@ecs.soton.ac.uk); Sanjay Chawla (Qatar Computing Research Institute; University of Sydney, Australia; sanjay.chawla@sydney.edu.au) |
| Pseudocode | Yes | Algorithm 1: Particle Thompson Sampling for Matrix Factorization (PTS). A simplified sketch appears after the table. |
| Open Source Code | No | The paper does not provide any links to open-source code for the described methodology or explicitly state that the code is publicly available. |
| Open Datasets | Yes | We use five real world datasets as follows: Movielens 100k, Movielens 1M, Yahoo Music (http://webscope.sandbox.yahoo.com/), Book crossing (http://www.bookcrossing.com) and EachMovie, as shown in Table 1. |
| Dataset Splits | No | We ran the algorithm using 80% data for training and the rest for testing and computed the MSE by averaging the results over 5 runs. The paper specifies a training/testing split but does not mention a separate validation set or split. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU or GPU models) used to run the experiments. |
| Software Dependencies | No | The paper mentions using the 'PMF implementation by [5]' but does not provide version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We use K = 2 for all the algorithms and use 30 particles for our approach. For PTS we set σ² = 0.5 and σ²_u = 1, σ²_v = 1. For PTS-B (Bayesian version, see Algo. 1 for more details), we set σ² = 0.5 and the initial shape parameters of the Gamma distribution as α = 2 and β = 0.5. For both ICF-20 and ICF-50, we set σ² = 0.5 and σ²_u = 1. ... We use stochastic gradient descent to update the latent factors with a mini-batch size of 50. To make a recommendation, we use the ε-greedy strategy: recommend the item with the highest U_i V^T with probability ε and make a random recommendation otherwise (ε is set to 0.95 in our experiments). A sketch of this setup appears after the table. |
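
To ground the Pseudocode row, below is a minimal, illustrative Python sketch of Particle Thompson Sampling as the paper describes it: each particle carries a posterior sample of the latent factors (U, V), a recommendation is made by sampling a particle and acting greedily under it, and each observed rating reweights and resamples the particles. The paper's Algorithm 1 Rao-Blackwellizes the particles and uses Gibbs-style moves; the plain (U, V) particles, the SGD nudge, and all problem sizes other than K = 2, the 30 particles, and the reported variances are simplifying assumptions for this demo, not the paper's implementation.

```python
# Simplified, illustrative sketch of Particle Thompson Sampling (PTS)
# for matrix factorization. Each particle is an independent sample of
# the latent factors (U, V); the paper's actual Algorithm 1 is a
# Rao-Blackwellized particle filter, so treat this as a toy analogue.
import numpy as np

rng = np.random.default_rng(0)

N_USERS, N_ITEMS, K = 50, 40, 2          # K = 2 as in the experiments
N_PARTICLES = 30                         # 30 particles, as reported
SIGMA2, SIGMA2_U, SIGMA2_V = 0.5, 1.0, 1.0

# Initialize particles from the Gaussian priors on U and V.
particles = [
    {
        "U": rng.normal(0.0, np.sqrt(SIGMA2_U), size=(N_USERS, K)),
        "V": rng.normal(0.0, np.sqrt(SIGMA2_V), size=(N_ITEMS, K)),
    }
    for _ in range(N_PARTICLES)
]
weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)

def recommend(user):
    """Thompson sampling step: sample one particle, act greedily under it."""
    d = rng.choice(N_PARTICLES, p=weights)
    scores = particles[d]["U"][user] @ particles[d]["V"].T
    return int(np.argmax(scores))

def update(user, item, rating, lr=0.05):
    """Reweight by the Gaussian likelihood of the observed rating,
    resample, then nudge the factors toward the observation (an SGD
    move standing in for the paper's Gibbs-style particle moves)."""
    global weights, particles
    liks = np.array([
        np.exp(-(rating - p["U"][user] @ p["V"][item]) ** 2 / (2 * SIGMA2))
        for p in particles
    ])
    weights = liks / liks.sum()
    # Multinomial resampling of the particle set.
    idx = rng.choice(N_PARTICLES, size=N_PARTICLES, p=weights)
    particles = [dict(U=particles[i]["U"].copy(), V=particles[i]["V"].copy())
                 for i in idx]
    weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)
    for p in particles:
        err = rating - p["U"][user] @ p["V"][item]
        p["U"][user] += lr * (err * p["V"][item] - p["U"][user] / SIGMA2_U)
        p["V"][item] += lr * (err * p["U"][user] - p["V"][item] / SIGMA2_V)

# One simulated interaction round with a synthetic rating.
u = rng.integers(N_USERS)
j = recommend(u)
update(u, j, rating=rng.normal(0.0, 1.0))
```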
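Likewise, a hedged sketch of the stated evaluation protocol and recommendation rule: an 80/20 train/test split with MSE averaged over 5 runs, and ε-greedy with ε = 0.95, where, per the quoted setup, ε is the probability of the greedy recommendation rather than of exploring (the reverse of the usual convention). `train_model` and the `ratings` array are hypothetical stand-ins, not artifacts from the paper.

```python
# Sketch of the quoted evaluation protocol under stated assumptions.
import numpy as np

rng = np.random.default_rng(0)
EPSILON = 0.95  # probability of the *greedy* action, per the paper's wording

def epsilon_greedy_recommend(U_i, V):
    """Recommend argmax_j U_i . V_j with probability EPSILON, else a random item."""
    if rng.random() < EPSILON:
        return int(np.argmax(U_i @ V.T))
    return int(rng.integers(V.shape[0]))

def mse_over_runs(ratings, train_model, n_runs=5, train_frac=0.8):
    """ratings: (N, 3) array of (user, item, rating) triples.
    train_model: hypothetical fitter returning a predict(user, item) callable."""
    errors = []
    for _ in range(n_runs):
        # Fresh random 80/20 split each run, as no fixed split is specified.
        perm = rng.permutation(len(ratings))
        cut = int(train_frac * len(ratings))
        train, test = ratings[perm[:cut]], ratings[perm[cut:]]
        predict = train_model(train)
        errors.append(np.mean([(r - predict(int(u), int(i))) ** 2
                               for u, i, r in test]))
    return float(np.mean(errors))  # MSE averaged over the runs
```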