Gaussian Process Bandits for Top-k Recommendations

Authors: Mohit Yadav, Cameron Musco, Daniel R. Sheldon

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Additionally, empirical results using a bandit simulator demonstrate that the proposed algorithm outperforms other baselines across various scenarios." "This section empirically evaluates the proposed GP-TopK bandit algorithms for top-k recommendations using a simulation based on the MovieLens dataset [4]."
Researcher Affiliation | Academia | Mohit Yadav, University of Massachusetts Amherst (ymohit@cs.umass.edu); Daniel Sheldon, University of Massachusetts Amherst (sheldon@cs.umass.edu); Cameron Musco, University of Massachusetts Amherst (cmusco@cs.umass.edu)
Pseudocode | Yes | Algorithm 1: Contextual Bandit Algorithm for Top-k Recommendations; Algorithm 2: Computing Weighted Convolutional Kendall Kernel; Algorithm 3: Computing Convolutional Kendall Kernel [10]. (A minimal Kendall-kernel sketch appears below the table.)
Open Source Code | No | "Our code can be accessed using this hyper-link." (The hyperlink itself is not present in the PDF text, so it does not provide concrete access to the code.)
Open Datasets | Yes | "using a simulation based on the MovieLens dataset [4]." [4] F. Maxwell Harper and Joseph A. Konstan. The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems, volume 5, pages 1–19, 2015.
Dataset Splits | Yes | "We consider a 1M variant of the MovieLens dataset, which contains 1 million ratings from 6040 users for 3677 items. Both context and item embeddings, i.e., c_u and θ_i, are 5-dimensional, optimized by considering the 5-fold performance on this dataset." (A sketch of such a fold construction appears below the table.)
Hardware Specification | Yes | "We utilized multiple NVIDIA Tesla M40 GPUs with 40 GB RAM on our in-house cluster for our experiments."
Software Dependencies | No | The paper describes its experimental setup and methods but does not explicitly list specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | "For setting up the reward functions, we utilize a similarity function s(c, θ) := ς(a·(cᵀθ) − b) to measure similarity between any user and item embeddings, where a and b are scale and shift scalars, respectively. We set a and b to 6 and 0.3, respectively, to fully utilize the range of the similarity function, as assessed by evaluating its value for many arms. We set λ = 0.75 to emphasize relevance over diversity. For the ϵ-greedy baselines, various values of ϵ were considered, specifically ϵ ∈ {0.01, 0.05, 0.1}. For the MAB-UCB baseline, ... β_mab values within the set {0.1, 0.25, 0.5} ... For the parameters of the proposed GP-TopK bandit algorithms, we set β_t = β_gp · log(|X| t² π²) with β_gp ∈ {0.05, 0.1, 0.5}. The selection of σ for all variants is determined by optimizing the log-likelihood of the observed data after every 10 rounds, considering values in the set {0.01, 0.05, 0.1}. We use 10 restarts and 5 steps in each search direction for the local search, starting with 1000 initial candidates." (A sketch of the similarity function and confidence schedule appears below the table.)
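
The Pseudocode row references Algorithms 2 and 3, which compute a (weighted) convolutional Kendall kernel between rankings. The sketch below is not the paper's algorithm; it is a minimal illustration of the underlying idea, a standard Kendall-tau kernel that scores two rankings over the same items by counting concordant versus discordant item pairs. The paper's convolutional and weighted variants build on this pairwise-agreement notion for top-k lists.

```python
# Hypothetical sketch of a Kendall-tau kernel between two rankings,
# illustrating the pairwise-agreement idea behind Algorithms 2-3
# (the paper's weighted/convolutional variants are more involved).
from itertools import combinations

def kendall_kernel(ranking_a, ranking_b):
    """Normalized Kendall-tau kernel in [-1, 1] between two rankings
    over the same set of items (each ranking lists item ids,
    most-preferred first)."""
    items = set(ranking_a)
    assert items == set(ranking_b), "rankings must cover the same items"
    pos_a = {item: r for r, item in enumerate(ranking_a)}
    pos_b = {item: r for r, item in enumerate(ranking_b)}

    agree = 0
    for i, j in combinations(items, 2):
        # +1 if both rankings order the pair (i, j) the same way, else -1.
        same_order = (pos_a[i] - pos_a[j]) * (pos_b[i] - pos_b[j]) > 0
        agree += 1 if same_order else -1

    n_pairs = len(items) * (len(items) - 1) // 2
    return agree / n_pairs

# Identical rankings give 1.0; fully reversed rankings give -1.0.
print(kendall_kernel([3, 1, 2], [3, 1, 2]))   # 1.0
print(kendall_kernel([3, 1, 2], [2, 1, 3]))   # -1.0
```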
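
The Dataset Splits row states that the 5-dimensional context and item embeddings were chosen by 5-fold performance on the MovieLens 1M ratings. The sketch below shows one plausible way to build such folds; the synthetic ratings frame, column names, and use of scikit-learn's KFold are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch of the 5-fold evaluation used to pick 5-dimensional
# embeddings; the synthetic ratings below stand in for MovieLens 1M
# (ml-1m/ratings.dat), and scikit-learn's KFold is an assumed tool choice.
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
ratings = pd.DataFrame({
    "user_id": rng.integers(0, 6040, size=10_000),
    "item_id": rng.integers(0, 3677, size=10_000),
    "rating": rng.integers(1, 6, size=10_000),
})

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(kfold.split(ratings)):
    train, val = ratings.iloc[train_idx], ratings.iloc[val_idx]
    # Fit 5-dimensional user/item embeddings (c_u, θ_i) on `train`,
    # e.g. by matrix factorization, and score them on `val`; the average
    # over the 5 folds guides the choice of embedding configuration.
    print(f"fold {fold}: {len(train)} train / {len(val)} val ratings")
```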
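
The Experiment Setup row quotes a sigmoid-shaped similarity s(c, θ) = ς(a·(cᵀθ) − b) with a = 6, b = 0.3, and a confidence schedule β_t = β_gp · log(|X| t² π²). The sketch below implements one reading of those (partially garbled) formulas; the exact grouping of terms in the paper may differ, so treat it as an illustration of the setup rather than the authors' code.

```python
# Hypothetical sketch of the reward-similarity function and the UCB
# confidence schedule quoted above; the exact formulas in the paper may
# group terms differently, so this is an illustration only.
import numpy as np

A, B = 6.0, 0.3          # scale and shift scalars from the setup
LAMBDA = 0.75            # relevance-vs-diversity trade-off (unused here)
BETA_GP = 0.1            # one of the grid values {0.05, 0.1, 0.5}

def similarity(c, theta):
    """Sigmoid-squashed similarity between a user context c and an item
    embedding theta (both 5-dimensional vectors)."""
    return 1.0 / (1.0 + np.exp(-(A * (c @ theta) - B)))

def beta_t(t, num_arms):
    """Confidence multiplier for round t >= 1, following
    beta_t = beta_gp * log(|X| * t^2 * pi^2)."""
    return BETA_GP * np.log(num_arms * t**2 * np.pi**2)

rng = np.random.default_rng(0)
c, theta = rng.normal(size=5), rng.normal(size=5)
print(similarity(c, theta), beta_t(t=10, num_arms=1000))
```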