Off-Policy Evaluation for Action-Dependent Non-stationary Environments
Authors: Yash Chandak, Shiv Shankar, Nathaniel Bastian, Bruno da Silva, Emma Brunskill, Philip S. Thomas
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This section presents both qualitative and quantitative empirical evaluations using several environments inspired by real-world applications that exhibit non-stationarity. |
| Researcher Affiliation | Academia | Yash Chandak (University of Massachusetts); Shiv Shankar (University of Massachusetts); Nathaniel D. Bastian (United States Military Academy); Bruno Castro da Silva (University of Massachusetts); Emma Brunskill (Stanford University); Philip S. Thomas (University of Massachusetts) |
| Pseudocode | Yes | A complete algorithm for the proposed procedure is provided in Appendix E.1. |
| Open Source Code | Yes | Code is available at https://github.com/yashchandak/active NS |
| Open Datasets | Yes | We use an open-source implementation [Xie, 2019] of the U.S. Food and Drug Administration (FDA) approved Type-1 Diabetes Mellitus simulator (T1DMS) [Man et al., 2014] for the treatment of Type-1 diabetes, where we induced non-stationarity by oscillating the body parameters (e.g., rate of glucose absorption, insulin sensitivity, etc.) between two known configurations available in the simulator. This induces passive non-stationarity, that is, changes are not dependent on past actions. |
| Dataset Splits | Yes | Due to space constraints, we defer the empirical results and discussion for this to Appendix E.5. [...] 3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments in the provided text. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., libraries, frameworks, or programming languages) used in the experiments within the provided text. |
| Experiment Setup | Yes | Due to space constraints, we defer the empirical results and discussion for this to Appendix E.5. [...] 3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] |
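
The Open Datasets row describes passive non-stationarity induced by oscillating the simulator's body parameters between two known configurations. A minimal sketch of that idea follows. It is not the authors' code: the parameter names, the numeric values, the sinusoidal schedule, and the helper names `mixing_weight` and `parameters_at` are all assumptions made for illustration, and the quoted text does not specify how the oscillation is actually scheduled.

```python
import math

# Hypothetical parameter sets: names and values are illustrative only,
# not the simulator's actual configurations.
CONFIG_A = {"glucose_absorption_rate": 0.8, "insulin_sensitivity": 1.0}
CONFIG_B = {"glucose_absorption_rate": 1.2, "insulin_sensitivity": 0.6}


def mixing_weight(episode: int, period: int) -> float:
    """Weight on CONFIG_B in [0, 1], oscillating with the episode index.

    A sinusoid is assumed here; any periodic schedule would serve.
    """
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * episode / period))


def parameters_at(episode: int, period: int = 20) -> dict:
    """Interpolate linearly between the two configurations.

    The schedule depends only on the episode index, never on the agent's
    past actions, which is what makes the non-stationarity passive.
    """
    w = mixing_weight(episode, period)
    return {k: (1.0 - w) * CONFIG_A[k] + w * CONFIG_B[k] for k in CONFIG_A}
```

With the default period of 20, `parameters_at(0)` coincides with `CONFIG_A` and `parameters_at(10)` with `CONFIG_B`. An action-dependent (active) variant would instead make the weight a function of the actions taken in earlier episodes.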