reproducibilityindex.ai

Deeply-Debiased Off-Policy Interval Estimation

Authors: Chengchun Shi, Runzhe Wan, Victor Chernozhukov, Rui Song

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our method is justified by theoretical results and numerical experiments. A Python implementation of the proposed procedure is available at https://github.com/RunzheStat/D2OPE. In this section, we evaluate the empirical performance of our method using two synthetic datasets
Researcher Affiliation	Academia	(1) Department of Statistics, London School of Economics and Political Science, London, United Kingdom; (2) Department of Statistics, North Carolina State University, Raleigh, USA; (3) Department of Economics, Massachusetts Institute of Technology, Cambridge, USA. Correspondence to: Rui Song <rsong@ncsu.edu>.
Pseudocode	No	The paper describes its procedure in numbered steps but does not provide a formal pseudocode block or algorithm listing.
Open Source Code	Yes	A Python implementation of the proposed procedure is available at https://github.com/RunzheStat/D2OPE.
Open Datasets	Yes	Take the Ohio T1DM dataset (Marling & Bunescu, 2018) as an example, only a few thousand observations are available (Shi et al., 2020b). ... using two synthetic datasets: CartPole from the OpenAI Gym environment (Brockman et al., 2016) and a simulation environment (referred to as Diabetes) to simulate the Ohio T1DM data (Shi et al., 2020b).
Dataset Splits	Yes	Step 1. Data Splitting. We randomly divide the indices of all trajectories $\{1, \dots, n\}$ into $K \ge 2$ disjoint subsets. Denote the $k$th subset by $I_k$ and let $I_k^c = \{1, \dots, n\} \setminus I_k$. Data splitting allows us to use one part of the data ($I_k^c$) to train RL models and the remaining part ($I_k$) to do the estimation of the main parameter, i.e., $\eta^{\pi}$.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies	No	The paper mentions 'A Python implementation' and using 'random forest' but does not provide specific version numbers for Python, random forest libraries, or any other software dependencies.
Experiment Setup	Yes	We set T = 300 and γ = 0.98 for CartPole, and T = 200 and γ = 0.95 for Diabetes. For both environments, we vary the number of trajectories n and the temperature τ to design different settings. Results are aggregated over 200 replications.
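
The data-splitting step quoted in the Dataset Splits row can be illustrated with a minimal sketch. This is not taken from the authors' repository: the function name `split_trajectory_indices`, the `seed` argument, and the use of NumPy are assumptions made for illustration, and indices run from 0 to n-1 rather than 1 to n as in the paper's notation.

```python
import numpy as np


def split_trajectory_indices(n, K, seed=0):
    """Randomly partition trajectory indices 0..n-1 into K disjoint folds.

    Hypothetical helper, not the authors' code. Returns a list of
    (complement, fold) pairs: `fold` plays the role of I_k (used for
    estimation) and `complement` plays the role of I_k^c (used to
    train the RL models).
    """
    if K < 2 or K > n:
        raise ValueError("need 2 <= K <= n")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)          # random ordering of all trajectory indices
    folds = np.array_split(perm, K)    # K disjoint subsets of near-equal size
    # np.setdiff1d returns the sorted indices that are not in the fold.
    return [(np.setdiff1d(perm, fold), fold) for fold in folds]


# Usage: iterate over the K (training, estimation) index pairs.
for train_idx, est_idx in split_trajectory_indices(n=10, K=2):
    print("train on", train_idx, "estimate on", est_idx)
```

Each trajectory index appears in exactly one estimation fold, so every trajectory contributes to the estimate of the main parameter while never being used to fit the models it is evaluated with.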