Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Off-Policy Evaluation for Action-Dependent Non-stationary Environments

Authors: Yash Chandak, Shiv Shankar, Nathaniel Bastian, Bruno da Silva, Emma Brunskill, Philip S. Thomas

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	This section presents both qualitative and quantitative empirical evaluations using several environments inspired by real-world applications that exhibit non-stationarity.
Researcher Affiliation	Academia	Yash Chandak (University of Massachusetts); Shiv Shankar (University of Massachusetts); Nathaniel D. Bastian (United States Military Academy); Bruno Castro da Silva (University of Massachusetts); Emma Brunskill (Stanford University); Philip S. Thomas (University of Massachusetts)
Pseudocode	Yes	A complete algorithm for the proposed procedure is provided in Appendix E.1.
Open Source Code	Yes	Code is available at https://github.com/yashchandak/active NS
Open Datasets	Yes	We use an open-source implementation [Xie, 2019] of the U.S. Food and Drug Administration (FDA) approved Type-1 Diabetes Mellitus simulator (T1DMS) [Man et al., 2014] for the treatment of Type-1 diabetes, where we induced non-stationarity by oscillating the body parameters (e.g., rate of glucose absorption, insulin sensitivity, etc.) between two known configurations available in the simulator. This induces passive non-stationarity, that is, changes are not dependent on past actions.
Dataset Splits	Yes	Due to space constraints, we defer the empirical results and discussion for this to Appendix E.5. 3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes]
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments in the provided text.
Software Dependencies	No	The paper does not provide specific version numbers for software dependencies (e.g., libraries, frameworks, or programming languages) used in the experiments within the provided text.
Experiment Setup	Yes	Due to space constraints, we defer the empirical results and discussion for this to Appendix E.5. 3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes]