Distributionally Robust Policy Evaluation and Learning in Offline Contextual Bandits
Authors: Nian Si, Fan Zhang, Zhengyuan Zhou, Jose Blanchet
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Additionally, we provide extensive simulations to demonstrate the robustness of our policy. |
| Researcher Affiliation | Collaboration | ¹Department of Management Science & Engineering, Stanford University; ²IBM Research and Stern School of Business, New York University. |
| Pseudocode | Yes | Algorithm 1 Distributionally Robust Policy Evaluation (a hedged sketch of this estimator appears below the table) |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the release of source code. |
| Open Datasets | No | The paper describes a simulation environment for data generation but does not provide access information (link, DOI, citation) for a publicly available or open dataset. For example: 'The feature vectors X_i ∈ R^10 are independently and uniformly drawn from [0, 1]^10.' |
| Dataset Splits | Yes | We first test the convergence of different estimators for δ = 0.2 and three different sizes of dataset: n = 10^3, 10^4, 10^5. (A reconstruction of this convergence check appears below the table.) |
| Hardware Specification | No | No explicit hardware specifications (e.g., GPU/CPU models, memory details) were mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers were mentioned in the paper. |
| Experiment Setup | Yes | We fix δ = 0.1, the training set size is n = 3000, and the policy class is depth-3 trees. (A policy-learning sketch follows the table.) |
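
Below is a minimal Python sketch of the evaluation step referenced in the Pseudocode row. It assumes the paper's KL-uncertainty-set dual, in which the robust value is sup over α ≥ 0 of -α log E_n[exp(-W_i/α)] - αδ, with W_i the importance-weighted rewards; the function name `dr_policy_value` and the bounded search range for α are our choices, not the authors'.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def dr_policy_value(X, A, Y, policy, propensity, delta):
    """Robust value of `policy` over a KL ball of radius delta (hedged sketch)."""
    # Importance-weighted rewards: W_i = Y_i * 1{pi(X_i) = A_i} / pi0(A_i | X_i).
    W = Y * (policy(X) == A) / propensity

    def neg_dual(alpha):
        # Stable log-mean-exp of -W/alpha, then the KL dual objective, negated
        # so that a scalar *minimizer* performs the concave maximization over alpha.
        z = -W / alpha
        m = z.max()
        log_mgf = m + np.log(np.mean(np.exp(z - m)))
        return alpha * log_mgf + alpha * delta

    res = minimize_scalar(neg_dual, bounds=(1e-6, 1e3), method="bounded")
    return -res.fun
```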
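Next, a hedged reconstruction of the simulation and convergence check quoted in the Open Datasets and Dataset Splits rows. The features are uniform on [0, 1]^10 as quoted, while the action count, reward model, and uniform logging policy are our own stand-ins (the paper's exact generative mechanism is not reproduced in the table). It reuses `dr_policy_value` from the sketch above.

```python
rng = np.random.default_rng(0)
n_actions, d = 3, 10

def generate_data(n):
    X = rng.uniform(0.0, 1.0, size=(n, d))      # X_i ~ Uniform[0, 1]^10 (as quoted)
    A = rng.integers(0, n_actions, size=n)      # uniform logging policy (our assumption)
    Y = X[:, 0] + 0.5 * (A == 0) + rng.normal(scale=0.1, size=n)  # illustrative rewards
    prop = np.full(n, 1.0 / n_actions)          # pi0(A_i | X_i) = 1/3
    return X, A, Y, prop

target_policy = lambda X: np.zeros(len(X), dtype=int)  # always play action 0

# Convergence of the robust estimate at delta = 0.2 over the three quoted sizes.
for n in (10**3, 10**4, 10**5):
    X, A, Y, prop = generate_data(n)
    print(n, dr_policy_value(X, A, Y, target_policy, prop, delta=0.2))
```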
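Finally, a sketch of the learning step behind the Experiment Setup row. The paper optimizes the robust value over depth-3 decision trees; since searching that class is involved, we stand in a hand-made finite candidate set and simply pick the robust-value maximizer. This illustrates the selection criterion, not the authors' tree-search procedure.

```python
def dr_policy_learn(candidates, X, A, Y, prop, delta=0.1):
    """Return the candidate policy with the largest robust value at radius delta."""
    return max(candidates, key=lambda pi: dr_policy_value(X, A, Y, pi, prop, delta))

# Matching the quoted setup: delta = 0.1 and a training set of n = 3000.
X, A, Y, prop = generate_data(3000)
candidates = [
    lambda X: np.zeros(len(X), dtype=int),      # constant policy: action 0
    lambda X: (X[:, 0] > 0.5).astype(int),      # threshold rule on the first feature
]
best = dr_policy_learn(candidates, X, A, Y, prop, delta=0.1)
```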