An Instrumental Variable Approach to Confounded Off-Policy Evaluation
Authors: Yang Xu, Jin Zhu, Chengchun Shi, Shikai Luo, Rui Song
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, we propose a number of policy value estimators and illustrate their effectiveness through extensive simulations and real data analysis from a world-leading short-video platform. In this section, we will conduct detailed comparisons between our estimator and other state-of-the-art methods for OPE estimation under MDPs via synthetic data (Section 7.1) and real-world data (Section 7.2). |
| Researcher Affiliation | Collaboration | 1. Department of Statistics, North Carolina State University, Raleigh, USA; 2. Sun Yat-sen University, China; 3. ByteDance, China; 4. Department of Statistics, London School of Economics and Political Science, London, UK. |
| Pseudocode | Yes | Algorithm 1 Model Selection for IV-based confounded OPE |
| Open Source Code | Yes | The source code is available on github: https://github.com/YangXU63/IVMDP. |
| Open Datasets | No | The paper mentions using "synthetic data" and a "real dataset from a world-leading technological company" for which they "generate a synthetic data environment based on the real data due to privacy considerations". No specific link, DOI, or formal citation is provided for public access to either the real or the generated synthetic dataset. |
| Dataset Splits | No | The paper describes the data generating process for simulations but does not specify how the data is split into training, validation, or test sets. It mentions "N = 1000 trajectories, each with T = 100 time points" but not the partitioning for model evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow). |
| Experiment Setup | No | The paper describes the data generation process and some parameters for a toy example (e.g., N, T, shift parameters alpha and beta). However, it does not provide specific hyperparameters or system-level training settings (e.g., learning rate, batch size, optimizer details, number of epochs) for the estimators described in the paper. |