Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions
Authors: Omer Gottesman, Joseph Futoma, Yao Liu, Sonali Parbhoo, Leo Celi, Emma Brunskill, Finale Doshi-Velez
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on medical simulations and real-world intensive care unit data demonstrate that our method can be used to identify limitations in the evaluation process and make evaluation more robust. |
| Researcher Affiliation | Academia | 1 Harvard University, 2 Stanford University, 3 MIT. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for reproducing the results in this paper can be found at https://github.com/dtak/interpretable_ope_public.git |
| Open Datasets | Yes | Our data source is a subset of the publicly available MIMIC-III dataset (Johnson et al., 2016). |
| Dataset Splits | No | Our final dataset consists of 346 patient trajectories (6777 transitions) for learning a policy and another 346 trajectories (6863 transitions) for evaluation of the policy via OPE and influence analysis. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | In all figures, we highlight in red all influential transitions our method would have highlighted for review by domain experts (I_c = 0.05). As an evaluation policy, we use the most common action of a state's 50 nearest neighbors. |
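
The evaluation policy quoted above is a nearest-neighbor majority vote: for a query state, it takes the most common action among that state's 50 nearest neighbors in the data. Below is a minimal sketch of that idea, assuming numpy/scikit-learn and a hypothetical helper name `knn_majority_policy`; it is an illustration, not the authors' released code.

```python
# Hedged sketch: a deterministic evaluation policy pi(s) = most common action
# among the k nearest neighbors of s in the observed data (k = 50 in the paper).
import numpy as np
from collections import Counter
from sklearn.neighbors import NearestNeighbors


def knn_majority_policy(states, actions, k=50):
    """Return a policy function mapping a state to the majority action of its k nearest neighbors."""
    nn = NearestNeighbors(n_neighbors=k).fit(states)

    def policy(query_state):
        # Find the indices of the k nearest neighbors of the query state.
        _, idx = nn.kneighbors(np.asarray(query_state).reshape(1, -1))
        neighbor_actions = actions[idx[0]]
        # Majority vote over the neighbors' actions.
        return Counter(neighbor_actions).most_common(1)[0][0]

    return policy


# Toy usage with synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 2))          # 200 two-dimensional states
actions = (states[:, 0] > 0).astype(int)    # binary actions correlated with the first state feature
pi = knn_majority_policy(states, actions, k=50)
print(pi([0.5, -0.2]))
```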