Online Corrupted User Detection and Regret Minimization
Authors: Zhiyong Wang, Jize Xie, Tong Yu, Shuai Li, John C.S. Lui
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With extensive experiments, our methods achieve superior performance over previous bandit algorithms and high corrupted user detection accuracy. |
| Researcher Affiliation | Collaboration | Zhiyong Wang (The Chinese University of Hong Kong, zywang21@cse.cuhk.edu.hk); Jize Xie (Shanghai Jiao Tong University, xjzzjl@sjtu.edu.cn); Tong Yu (Adobe Research, worktongyu@gmail.com); Shuai Li (Shanghai Jiao Tong University, shuaili8@sjtu.edu.cn); John C.S. Lui (The Chinese University of Hong Kong, cslui@cse.cuhk.edu.hk) |
| Pseudocode | Yes | Algorithm 1 RCLUB-WCU |
| Open Source Code | No | The paper does not provide explicit statements or links for the open-source code of the described methodology. |
| Open Datasets | Yes | We use three real-world datasets: Movielens [11], Amazon [31], and Yelp [33]. |
| Dataset Splits | No | The paper describes data generation and processing, but does not explicitly mention train/validation/test dataset splits with percentages or counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We use $u = 1{,}000$ users and $m = 10$ clusters, where each cluster contains 100 users. We randomly select 100 users as the corrupted users. The preference and arm (item) vectors are drawn in $d-1$ ($d = 50$) dimensions with each entry a standard Gaussian variable, then normalized, extended with one more dimension of constant 1, and divided by 2 [21]. We fix an arm set with $\lvert\mathcal{A}\rvert = 1000$ items; at each round, 20 items are randomly selected to form a set $\mathcal{A}_t$ to choose from. Following [40, 3], in the first $k$ rounds we always flip the reward of corrupted users by setting $r_t = -x_{a_t}^\top \theta_{i_t,t} + \eta_t$, and we leave the remaining $T - k$ rounds intact. Here we set $T = 1{,}000{,}000$ and $k = 20{,}000$. |
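
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is not the authors' code (the paper releases none); it is a minimal NumPy illustration under stated assumptions. The function names, the contiguous assignment of users to clusters, sampling the 20 candidate items without replacement, and Gaussian noise with a hypothetical `noise_std` are all assumptions, since the quoted text does not specify them.

```python
import numpy as np


def draw_vectors(rng, n, d):
    """Draw n vectors: (d-1)-dim standard Gaussian entries, normalized,
    extended with a constant 1, then divided by 2."""
    v = rng.standard_normal((n, d - 1))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    v = np.hstack([v, np.ones((n, 1))])
    return v / 2.0


def make_synthetic_setup(num_users=1000, num_clusters=10, d=50,
                         num_arms=1000, num_corrupted=100, seed=0):
    rng = np.random.default_rng(seed)
    # Assumption: users in a cluster share one preference vector.
    theta = draw_vectors(rng, num_clusters, d)
    # Assumption: users are assigned to clusters in contiguous blocks.
    cluster_of = np.repeat(np.arange(num_clusters), num_users // num_clusters)
    arms = draw_vectors(rng, num_arms, d)
    corrupted = rng.choice(num_users, size=num_corrupted, replace=False)
    return {"theta": theta, "cluster_of": cluster_of,
            "arms": arms, "corrupted": corrupted, "rng": rng}


def sample_arm_set(rng, num_arms, size=20):
    """Pick the candidate items for one round (assumed without replacement)."""
    return rng.choice(num_arms, size=size, replace=False)


def observe_reward(x, theta, t, k, is_corrupted, rng, noise_std=0.1):
    """Reward for arm x at round t (1-indexed). For corrupted users the
    expected reward is sign-flipped during the first k rounds.
    noise_std is a hypothetical value, not taken from the paper."""
    mean = float(x @ theta)
    if is_corrupted and t <= k:
        mean = -mean
    return mean + noise_std * rng.standard_normal()
```

With this construction every preference and arm vector has norm $\sqrt{2}/2$, which keeps the expected reward $x^\top\theta$ in a bounded range. A full run would loop over $T$ rounds, call `sample_arm_set` each round, and pass `t` and `k = 20000` to `observe_reward`.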