Off-Policy Evaluation for Action-Dependent Non-stationary Environments

Authors: Yash Chandak, Shiv Shankar, Nathaniel Bastian, Bruno da Silva, Emma Brunskill, Philip S. Thomas

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section presents both qualitative and quantitative empirical evaluations using several environments inspired by real-world applications that exhibit non-stationarity.
Researcher Affiliation | Academia | Yash Chandak (University of Massachusetts); Shiv Shankar (University of Massachusetts); Nathaniel D. Bastian (United States Military Academy); Bruno Castro da Silva (University of Massachusetts); Emma Brunskill (Stanford University); Philip S. Thomas (University of Massachusetts)
Pseudocode | Yes | A complete algorithm for the proposed procedure is provided in Appendix E.1.
Open Source Code | Yes | Code is available at https://github.com/yashchandak/active NS
Open Datasets | Yes | We use an open-source implementation [Xie, 2019] of the U.S. Food and Drug Administration (FDA) approved Type-1 Diabetes Mellitus simulator (T1DMS) [Man et al., 2014] for the treatment of Type-1 diabetes, where we induced non-stationarity by oscillating the body parameters (e.g., rate of glucose absorption, insulin sensitivity, etc.) between two known configurations available in the simulator. This induces passive non-stationarity, that is, changes are not dependent on past actions.
Dataset Splits | Yes | Due to space constraints, we defer the empirical results and discussion for this to Appendix E.5. 3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes]
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments in the provided text.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., libraries, frameworks, or programming languages) used in the experiments within the provided text.
Experiment Setup | Yes | Due to space constraints, we defer the empirical results and discussion for this to Appendix E.5. 3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes]
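
The Open Datasets entry describes passive non-stationarity induced by oscillating body parameters between two known simulator configurations. The following is a minimal sketch of that idea only, not the authors' code and not the T1DMS API: the parameter names, their values, the sinusoidal schedule, and the period are all assumptions made for illustration. The point it shows is that the parameters depend solely on the episode index, never on past actions.

```python
import math

# Hypothetical parameter sets: names and values are illustrative only,
# not taken from T1DMS or from the paper.
CONFIG_A = {"glucose_absorption_rate": 0.8, "insulin_sensitivity": 1.2}
CONFIG_B = {"glucose_absorption_rate": 1.2, "insulin_sensitivity": 0.8}


def mixing_weight(episode: int, period: int) -> float:
    """Weight on CONFIG_B in [0, 1]; a function of the episode index only."""
    # Assumed sinusoidal schedule: 0 at episode 0, 1 at half a period.
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * episode / period))


def body_parameters(episode: int, period: int = 20) -> dict:
    """Interpolate linearly between the two configurations for one episode."""
    w = mixing_weight(episode, period)
    return {k: (1.0 - w) * CONFIG_A[k] + w * CONFIG_B[k] for k in CONFIG_A}


if __name__ == "__main__":
    for episode in (0, 5, 10, 15, 20):
        print(episode, body_parameters(episode))
```

Because `body_parameters` takes no action or policy argument, any policy would face the same parameter sequence, which is what distinguishes this passive setting from the action-dependent (active) non-stationarity in the paper's title.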