reproducibilityindex.ai

Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding

Authors: Hongseok Namkoong, Ramtin Keramati, Steve Yadlowsky, Emma Brunskill

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	On simulated healthcare examples (management of sepsis and interventions for autistic children) where this is a reasonable model, we demonstrate that our method invalidates non-robust results and provides meaningful certificates of robustness, allowing reliable selection of policies under unobserved confounding.
Researcher Affiliation	Academia	Hongseok Namkoong, Decision, Risk, and Operations Division, Columbia Business School, namkoong@gsb.columbia.edu; Ramtin Keramati, Computational and Mathematical Engineering, Stanford University, keramati@cs.stanford.edu; Steve Yadlowsky, Electrical Engineering, Stanford University, syadlows@stanford.edu; Emma Brunskill, Computer Science, Stanford University, ebrun@cs.stanford.edu
Pseudocode	No	The paper includes mathematical formulations and theorems but does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code	Yes	Our code is publicly available at https://github.com/StanfordAI4HI/off_policy_confounding.git
Open Datasets	Yes	Using the sepsis simulator developed by Oberst and Sontag [38], we consider a scenario where automated policies have been proposed, and we wish to evaluate their benefits. Using a simulator for autistic children developed by Lu et al. [31], which models the data from a (real) sequential randomized trial (SMART) [23], we compare different approaches for improving the number of speech utterances.
Dataset Splits	No	The paper mentions data generation from simulators and evaluation but does not specify explicit train/validation/test dataset splits with percentages or sample counts for the data used in experiments.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments.
Software Dependencies	No	The paper does not specify any software dependencies or their version numbers required for replication of the experiments.
Experiment Setup	Yes	To simulate unrecorded comorbidities that could introduce confounding, we simulate an unobserved confounder associated with favorable state transitions. At t = 1, we take the optimal action with respect to all other options (vasopressors and mechanical ventilation), and administer antibiotics with probability sqrt(Gamma)/(1+sqrt(Gamma)) if the confounding variable is large, and with probability 1/(1+sqrt(Gamma)) if the confounding variable is small. This satisfies Assumption F with level Gamma. For t >= 2, the behavior policy takes the optimal next treatment action with probability 0.85, and otherwise switches the vasopressor status, independent of the confounders.
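
The behavior policy described in the Experiment Setup row can be sketched roughly as follows. This is an illustrative sketch, not the authors' code. It assumes the two antibiotic probabilities at t = 1 are sqrt(Gamma)/(1+sqrt(Gamma)) and 1/(1+sqrt(Gamma)), a choice under which the odds ratio between the two confounder levels equals Gamma. The function names, the dictionary representation of an action, and the value Gamma = 2.0 are hypothetical.

```python
import math
import random


def antibiotic_prob(gamma: float, confounder_is_large: bool) -> float:
    """Assumed probability of giving antibiotics at t = 1 for a confounding level gamma."""
    root = math.sqrt(gamma)
    if confounder_is_large:
        return root / (1.0 + root)
    return 1.0 / (1.0 + root)


def odds(p: float) -> float:
    """Odds p / (1 - p) of a probability p."""
    return p / (1.0 - p)


def later_step_action(optimal_action: dict, rng: random.Random,
                      p_optimal: float = 0.85) -> dict:
    """For t >= 2: keep the optimal action with probability p_optimal,
    otherwise flip the vasopressor status, ignoring the confounder."""
    action = dict(optimal_action)  # copy so the caller's action is not mutated
    if rng.random() >= p_optimal:
        action["vasopressor"] = not action["vasopressor"]
    return action


# Hypothetical confounding level, chosen only for illustration.
gamma = 2.0
p_large = antibiotic_prob(gamma, confounder_is_large=True)
p_small = antibiotic_prob(gamma, confounder_is_large=False)

# The odds ratio between the two confounder levels should equal gamma.
odds_ratio = odds(p_large) / odds(p_small)
print(f"p_large={p_large:.4f}, p_small={p_small:.4f}, odds_ratio={odds_ratio:.4f}")
```

Under these assumptions the confounder affects only the antibiotic decision at t = 1, and the computed odds ratio matches the chosen Gamma up to floating-point error.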