Reward Imputation with Sketching for Contextual Batched Bandits

Authors: Xiao Zhang, Ninglu Shao, Zihua Si, Jun Xu, Wenhan Wang, Hanjing Su, Ji-Rong Wen

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that SPUIR outperforms state-of-the-art baselines on synthetic, public benchmark, and real-world datasets. [...] We carried out extensive experiments on a synthetic dataset, the publicly available Criteo dataset, and a dataset from a commercial app to demonstrate our performance, empirically analyzed the influence of different parameters, and verified the correctness of the theoretical results.
Researcher Affiliation | Collaboration | Xiao Zhang (1, 2), Ninglu Shao (1, 2), Zihua Si (1, 2), Jun Xu (1, 2), Wenhan Wang (3), Hanjing Su (3), Ji-Rong Wen (1, 2). (1) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; (2) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; (3) Tencent Inc., Shenzhen, China. {zhangx89, ninglu_shao, zihua_si, junxu, jrwen}@ruc.edu.cn, {justinsu, ezewang}@tencent.com
Pseudocode | Yes | Algorithm 2: Sketched Policy Updating with Imputed Rewards (SPUIR) in the (n + 1)-th episode
Open Source Code | No | The paper does not provide any specific links or statements indicating that the source code for the methodology is openly available.
Open Datasets | Yes | We empirically evaluated the performance of our algorithms on 3 datasets: the synthetic dataset, publicly available Criteo dataset (footnote 6) (Criteo-recent, Criteo-all), and dataset collected from Tencent's WeChat app for coupon recommendation (commercial product). Footnote 6: https://labs.criteo.com/2013/12/conversion-logs-dataset/
Dataset Splits | No | The paper mentions using different datasets for evaluation but does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages or exact counts for each split).
Hardware Specification | Yes | We applied the algorithms to CBB setting and implemented on Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz, and repeated the experiments 20 times.
Software Dependencies | No | The paper does not specify any software names with version numbers, such as programming languages, libraries, or frameworks used for implementation or experimentation.
Experiment Setup | Yes | According to Remark 4, we set the batch size as B = C_B^2 N/d, the constant C_B ≈ 25, and the sketch size c = 150 on all the datasets.
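
The batch-size rule quoted in the Experiment Setup entry can be worked through numerically. The sketch below is only an illustration under stated assumptions: it reads the rule as B = C_B^2 * N / d, takes N to be the number of episodes and d the context dimension (this summary does not define either symbol), and rounds to the nearest integer. The function name `batch_size`, the rounding, and the example values N = 40 and d = 100 are hypothetical and do not come from the paper.

```python
import math

# Sketch size quoted in the Experiment Setup entry (c = 150 on all datasets).
SKETCH_SIZE = 150


def batch_size(num_episodes: int, context_dim: int, c_b: float = 25.0) -> int:
    """Return B = C_B^2 * N / d, rounded to the nearest integer (at least 1).

    Assumptions (not stated in this summary): num_episodes is N,
    context_dim is d, and rounding to an integer is acceptable.
    """
    if num_episodes <= 0 or context_dim <= 0:
        raise ValueError("num_episodes and context_dim must be positive")
    return max(1, round(c_b ** 2 * num_episodes / context_dim))


if __name__ == "__main__":
    n, d = 40, 100  # hypothetical values, chosen so the arithmetic is exact
    b = batch_size(n, d)
    total_steps = b * n  # T = B * N under the same reading
    print(f"B = {b}, T = {total_steps}, sketch size c = {SKETCH_SIZE}")
    # With T = B * N, the rule is equivalent to B = C_B * sqrt(T / d).
    assert math.isclose(b, 25.0 * math.sqrt(total_steps / d))
```

Under this reading, with T = B * N total steps, the rule is the same as B = C_B * sqrt(T / d), so the batch size grows with the square root of the horizon rather than linearly in it.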