Uncertainty-Aware Instance Reweighting for Off-Policy Learning
Authors: Xiaoying Zhang, Junpu Chen, Hongning Wang, Hong Xie, Yang Liu, John C.S. Lui, Hang Li
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results on the synthetic and real-world recommendation datasets demonstrate that UIPS significantly improves the quality of the discovered policy, when compared against an extensive list of state-of-the-art baselines. |
| Researcher Affiliation | Collaboration | 1 ByteDance Research; 2 Chongqing University; 3 Tsinghua University; 4 Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences; 5 The Chinese University of Hong Kong |
| Pseudocode | Yes | Algorithm 1 UIPS (found in Appendix 7.1) |
| Open Source Code | Yes | All data and code can be found in https://github.com/Xiaoyinggit/UIPS.git. |
| Open Datasets | Yes | We evaluate UIPS on both synthetic data and three real-world datasets with unbiased collection... Yahoo! R3; (2) Coat; (3) KuaiRec [12]... The Wiki10-31K dataset contains approximately 20K samples. |
| Dataset Splits | Yes | We split the dataset into train, validation and test sets with size 11K:3K:6K. (synthetic data) ... a small part of unbiased data split for validation purpose (5% on Yahoo! R3 and Coat, and 15% on KuaiRec). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using neural networks and logistic regression but does not provide specific software dependencies or their version numbers. |
| Experiment Setup | Yes | the learning rate was searched in {1e-5, 1e-4, 1e-3, 1e-2}; λ, γ, η1 were searched in {0.5, 0.1, 1, 2, 5, 10, 15, 20, 25, 30, 40, 50}. And η2 was searched in {1, 10, 100, 1000}. |
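
The search ranges quoted in the Experiment Setup row can be written down as a small configuration sketch. The snippet below is illustrative only: the names `LEARNING_RATES`, `SHARED_RANGE`, `ETA2_RANGE`, and `iter_configs` are hypothetical, and the quoted text does not say whether the authors searched the full joint grid or tuned each hyperparameter separately. The full Cartesian product is assumed here purely to show how large such a search would be.

```python
from itertools import product

# Candidate values copied from the Experiment Setup row (order preserved).
LEARNING_RATES = [1e-5, 1e-4, 1e-3, 1e-2]
SHARED_RANGE = [0.5, 0.1, 1, 2, 5, 10, 15, 20, 25, 30, 40, 50]  # used for lambda, gamma, eta1
ETA2_RANGE = [1, 10, 100, 1000]


def iter_configs():
    """Yield every combination, ASSUMING a full joint grid (not stated in the source)."""
    for lr, lam, gamma, eta1, eta2 in product(
        LEARNING_RATES, SHARED_RANGE, SHARED_RANGE, SHARED_RANGE, ETA2_RANGE
    ):
        yield {"lr": lr, "lambda": lam, "gamma": gamma, "eta1": eta1, "eta2": eta2}


if __name__ == "__main__":
    configs = list(iter_configs())
    print(len(configs))  # number of combinations under the joint-grid assumption
    print(configs[0])
```

Under the joint-grid assumption the search covers 4 × 12³ × 4 combinations, which suggests that a staged or per-parameter search may be more practical; the summary above does not state which was used.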