Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Authors: Ziyang Tang*, Yihao Feng*, Lihong Li, Dengyong Zhou, Qiang Liu
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Both theoretical and empirical results show that our method yields significant advantages over previous methods. |
| Researcher Affiliation | Collaboration | Ziyang Tang* (The University of Texas at Austin), Yihao Feng (The University of Texas at Austin), Lihong Li (Google Research), Dengyong Zhou (Google Research), Qiang Liu (The University of Texas at Austin) |
| Pseudocode | Yes | Algorithm 1 Infinite Horizon Doubly Robust Estimator |
| Open Source Code | No | The paper mentions using 'open source implementation' (footnote 2) for deep Q-learning, which points to a third-party repository. It also provides a link for 'additional experimental results' (footnote 3) but does not contain an unambiguous statement of releasing the specific code for their *own* methodology. |
| Open Datasets | Yes | Taxi Environment: We follow Liu et al. (2018a)'s tabular environment Taxi. |
| Dataset Splits | No | The paper mentions using 'a set of independent sample to first train a value function $\hat{V}$ and a density function $\hat{\rho}$' and 'a separate training dataset with 200 trajectories whose horizon length is 1000', but does not provide specific numerical splits (e.g., percentages or counts) for training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running its experiments. |
| Software Dependencies | No | The paper mentions 'OpenAI Gym' and the 'Adam Optimizer' but does not provide version numbers for any software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | For more experimental details, please check appendix C.1. |