Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

High-Confidence Off-Policy (or Counterfactual) Variance Estimation

Authors: Yash Chandak, Shiv Shankar, Philip S. Thomas (pp. 6939-6947)

AAAI 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental Study Inspired by real-world applications where OVE and HCOVE can be useful, we validate our proposed estimators empirically on two domains motivated by real-world applications. Here, we only provide a brief description about the experimental setup and the main results. Appendix G contains additional experimental details. Figure 3: Experimental results using 100 trials.
Researcher Affiliation	Academia	Yash Chandak, Shiv Shankar, Philip S. Thomas University of Massachusetts EMAIL
Pseudocode	Yes	Algorithm 1: Variance-Reduced Off-Policy Variance Estimator
Open Source Code	No	The paper cites an “open-source implementation (Xie 2019) of the FDA approved Type-1 Diabetes Mellitus simulator (T1DMS)”, but does not provide a link or statement about the authors’ own source code for the methodology described in the paper.
Open Datasets	Yes	Diabetes treatment: This domain is based on an open-source implementation (Xie 2019) of the FDA approved Type-1 Diabetes Mellitus simulator (T1DMS) (Man et al. 2014) for treatment of Type-1 Diabetes... Gridworld: We also consider a standard 4 × 4 Gridworld with stochastic transitions.
Dataset Splits	No	The paper mentions using trajectories and varying the number of trajectories in its experiments, but it does not provide specific details on how these trajectories were split into training, validation, or test sets, nor does it refer to predefined splits with citations.
Hardware Specification	No	The paper does not provide any specific details regarding the hardware used to conduct the experiments, such as CPU or GPU models, memory, or cloud computing specifications.
Software Dependencies	Yes	Simglucose v0.2.1 (2018). URL https://github.com/jxx123/simglucose.
Experiment Setup	No	The paper includes an “Experimental Study” section that describes the domains used but lacks specific details regarding hyperparameters (e.g., learning rate, batch size) or system-level training settings. It refers to “Appendix G” for additional details, but these are not present in the main text.