Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Online Clustering of Bandits with Misspecified User Models

Authors: Zhiyong Wang, Jize Xie, Xutong Liu, Shuai Li, John C.S. Lui

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on both synthetic and real-world data show our outperformance over previous algorithms. This section compares RCLUMB and RSCLUMB with CLUB [12], SCLUB [27], LinUCB with a single estimated vector for all users, LinUCB-Ind with separate estimated vectors for each user, and two modifications of LinUCB in [23] which we name as RLinUCB and RLinUCB-Ind. We use averaged reward as the evaluation metric, where the average is taken over ten independent trials.
Researcher Affiliation Academia Zhiyong Wang (The Chinese University of Hong Kong); Jize Xie (Shanghai Jiao Tong University); Xutong Liu (The Chinese University of Hong Kong); Shuai Li (Shanghai Jiao Tong University); John C.S. Lui (The Chinese University of Hong Kong)
Pseudocode Yes Algorithm 1 Robust Clustering of Misspecified Bandits Algorithm (RCLUMB)
Open Source Code No The paper does not contain any explicit statement about making the source code publicly available or a link to a code repository.
Open Datasets Yes We conduct experiments on the Yelp data and the 20m MovieLens data [17].
Dataset Splits No The paper describes data generation and experimental setup for synthetic and real-world datasets, but it does not specify explicit training, validation, or test dataset splits or percentages.
Hardware Specification No The paper does not provide specific hardware details such as GPU/CPU models or memory used for running experiments.
Software Dependencies No The paper does not list any specific software dependencies with version numbers (e.g., Python, PyTorch, or specific libraries).
Experiment Setup Yes Input: Deletion parameter α1, α2 > 0, f(T) = q 1+T , λ, β, ϵ > 0. We consider a setting with u = 1,000 users, m = 10 clusters and T = 10^6 rounds. The preference and feature vectors are in d = 50 dimension with each entry drawn from a standard Gaussian distribution, and are normalized to vectors with ‖·‖₂ = 1 [27]. We fix an arm set with |A| = 1,000 items; at each round t, 20 items are randomly selected to form a set A_t for the user to choose from. We construct a matrix ϵ ∈ ℝ^(1,000×1,000) in which each element ϵ(i, j) is drawn uniformly from the range (-0.2, 0.2) to represent the deviation.
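
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code (the paper releases none). The function names are hypothetical, and three points are assumptions not stated in the quoted text: users are assigned to clusters uniformly at random, users in the same cluster share one preference vector, and the deviation ϵ(i, j) is indexed by (user, arm) and added to the linear reward term.

```python
import math
import random


def unit_gaussian_vector(rng, d):
    """Draw d standard Gaussian entries and scale the vector to unit L2 norm."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def make_synthetic_setup(u=1000, m=10, d=50, n_arms=1000, dev=0.2, seed=0):
    """Build cluster preference vectors, user assignments, arms, and deviations."""
    rng = random.Random(seed)
    # Assumption: one shared preference vector per cluster.
    cluster_theta = [unit_gaussian_vector(rng, d) for _ in range(m)]
    # Assumption: each user is placed in a cluster uniformly at random.
    user_cluster = [rng.randrange(m) for _ in range(u)]
    # Fixed arm set of unit-norm feature vectors.
    arms = [unit_gaussian_vector(rng, d) for _ in range(n_arms)]
    # Deviation matrix with entries drawn uniformly from the range (-dev, dev).
    # Assumption: rows index users and columns index arms.
    deviation = [[rng.uniform(-dev, dev) for _ in range(n_arms)] for _ in range(u)]
    return cluster_theta, user_cluster, arms, deviation


def sample_candidates(rng, n_arms=1000, k=20):
    """Pick the k arms offered to the user in one round, without replacement."""
    return rng.sample(range(n_arms), k)


def expected_reward(setup, user, arm):
    """Misspecified expected reward: linear term plus the (user, arm) deviation.

    Assumption: the deviation enters additively; observation noise is omitted.
    """
    cluster_theta, user_cluster, arms, deviation = setup
    theta = cluster_theta[user_cluster[user]]
    linear = sum(a * b for a, b in zip(theta, arms[arm]))
    return linear + deviation[user][arm]
```

Calling `make_synthetic_setup()` with its defaults uses the sizes quoted above (u = 1,000, m = 10, d = 50, |A| = 1,000); smaller values are convenient for quick checks. Because both vectors have unit norm, the linear term lies in [-1, 1], so the expected reward stays within [-1.2, 1.2] under these assumptions.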