reproducibilityindex.ai

Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions

Authors: Omer Gottesman, Joseph Futoma, Yao Liu, Sonali Parbhoo, Leo Celi, Emma Brunskill, Finale Doshi-Velez

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on medical simulations and real-world intensive care unit data demonstrate that our method can be used to identify limitations in the evaluation process and make evaluation more robust.
Researcher Affiliation	Academia	¹Harvard University, ²Stanford University, ³MIT.
Pseudocode	No	The paper does not contain any pseudocode or algorithm blocks.
Open Source Code	Yes	Code for reproducing the results in this paper can be found at https://github.com/dtak/interpretable_ope_public.git
Open Datasets	Yes	Our data source is a subset of the publicly available MIMIC-III dataset (Johnson et al., 2016).
Dataset Splits	No	Our final dataset consists of 346 patient trajectories (6777 transitions) for learning a policy and another 346 trajectories (6863 transitions) for evaluation of the policy via OPE and influence analysis.
Hardware Specification	No	The paper does not provide specific details on the hardware used for running experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers.
Experiment Setup	Yes	In all figures, we highlight in red all influential transitions our method would have highlighted for review by domain experts (I_c = 0.05). As an evaluation policy, we use the most common action of a state's 50 nearest neighbors.
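
The evaluation policy quoted in the Experiment Setup row (the most common action among a state's 50 nearest neighbors) can be illustrated with a minimal sketch. This is not the authors' code: the function name `knn_majority_action`, the Euclidean distance metric, and the tie-breaking rule (the action seen first among the nearest neighbors wins) are assumptions made for illustration, since the excerpt specifies none of them.

```python
import math
from collections import Counter


def knn_majority_action(query_state, states, actions, k=50):
    """Return the most common action among the k stored states nearest to query_state.

    Hypothetical helper: Euclidean distance and the tie-breaking rule are
    assumptions, not details taken from the paper.
    """
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    if not states:
        raise ValueError("at least one stored transition is required")

    # Never ask for more neighbors than there are stored states.
    k = min(k, len(states))

    # Indices of stored states, ordered from nearest to farthest.
    order = sorted(range(len(states)),
                   key=lambda i: math.dist(query_state, states[i]))

    # Actions taken in the k nearest states, nearest first.
    neighbor_actions = [actions[i] for i in order[:k]]

    # Majority vote; equal counts resolve to the action encountered first.
    return Counter(neighbor_actions).most_common(1)[0][0]


# Toy data: two clusters of 2-D states with discrete actions.
toy_states = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
toy_actions = [0, 0, 1, 2, 2]

near_origin = knn_majority_action((0.05, 0.05), toy_states, toy_actions, k=3)
near_far_cluster = knn_majority_action((5.0, 5.1), toy_states, toy_actions, k=2)
```

With the toy data, the three neighbors of the first query carry actions 0, 0, and 1, so the vote returns 0; both neighbors of the second query carry action 2. In the setting described in the table, `k` would be 50 and the stored states would presumably come from the trajectories set aside for evaluation or policy learning, a detail the excerpt does not settle.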