Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Reward Imputation with Sketching for Contextual Batched Bandits
Authors: Xiao Zhang, Ninglu Shao, Zihua Si, Jun Xu, Wenhan Wang, Hanjing Su, Ji-Rong Wen
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that SPUIR outperforms state-of-the-art baselines on synthetic, public benchmark, and real-world datasets. [...] We carried out extensive experiments on a synthetic dataset, the publicly available Criteo dataset, and a dataset from a commercial app to demonstrate our performance, empirically analyzed the influence of different parameters, and verified the correctness of the theoretical results. |
| Researcher Affiliation | Collaboration | Xiao Zhang 1,2, Ninglu Shao 1,2, Zihua Si 1,2, Jun Xu 1,2, Wenhan Wang 3, Hanjing Su 3, Ji-Rong Wen 1,2. 1 Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; 2 Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; 3 Tencent Inc., Shenzhen, China |
| Pseudocode | Yes | Algorithm 2 Sketched Policy Updating with Imputed Rewards (SPUIR) in the (n + 1)-th episode |
| Open Source Code | No | The paper does not provide any specific links or statements indicating that the source code for the methodology is openly available. |
| Open Datasets | Yes | We empirically evaluated the performance of our algorithms on 3 datasets: the synthetic dataset, the publicly available Criteo dataset (Criteo-recent, Criteo-all), and a dataset collected from Tencent's WeChat app for coupon recommendation (commercial product). Footnote 6: https://labs.criteo.com/2013/12/conversion-logs-dataset/ |
| Dataset Splits | No | The paper mentions using different datasets for evaluation but does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages or exact counts for each split). |
| Hardware Specification | Yes | We applied the algorithms to CBB setting and implemented on Intel(R) Xeon(R) Silver 4114 CPU@2.20GHz, and repeated the experiments 20 times. |
| Software Dependencies | No | The paper does not specify any software names with version numbers, such as programming languages, libraries, or frameworks used for implementation or experimentation. |
| Experiment Setup | Yes | According to Remark 4, we set the batch size as B = C_B^2 N/d, the constant C_B = 25, and the sketch size c = 150 on all the datasets. |
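For context on the technique being classified above: SPUIR combines reward imputation with sketched policy updates. The following is a minimal conceptual illustration only, not the authors' algorithm. It shows the two ingredients in isolation, using a plain Gaussian random-projection sketch of the observed contexts and a ridge-regression imputation of unobserved rewards; all variable names, dimensions, and the 50% observation rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, c, n = 8, 32, 400   # context dim, sketch size, batch rows (illustrative)
lam = 1.0              # ridge regularizer

# One batch of contexts and partially observed rewards for a single arm.
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
r = X @ theta_true + 0.1 * rng.normal(size=n)
observed = rng.random(n) < 0.5            # only ~half the rewards observed

# Sketch the observed rows: S is a c x n_obs Gaussian projection, so the
# regression works with the c x d matrix S @ X_obs instead of X_obs itself.
X_obs, r_obs = X[observed], r[observed]
S = rng.normal(size=(c, X_obs.shape[0])) / np.sqrt(c)
Xs, rs = S @ X_obs, S @ r_obs

# Ridge regression on the sketched data, then impute unobserved rewards
# with the fitted linear model while keeping observed rewards as-is.
theta_hat = np.linalg.solve(Xs.T @ Xs + lam * np.eye(d), Xs.T @ rs)
r_imputed = np.where(observed, r, X @ theta_hat)
```

The sketch reduces the per-update regression cost from O(n_obs · d²) to O(c · d²) once the projection is applied; the paper's contribution lies in doing this inside a batched bandit policy with theoretical guarantees, which this toy fragment does not attempt to reproduce.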