Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Wasserstein Distributionally Robust Policy Evaluation and Learning for Contextual Bandits

Authors: Yi Shen, Pan Xu, Michael Zavlanos

TMLR 2024

Reproducibility Variables (Result, with the LLM's supporting response)

Research Type: Experimental
  LLM response: "We further validate our approach using a public dataset that was recorded in a randomized stroke trial. In our study, we validate the proposed methods using a randomized controlled trial dataset that studies the effects of drug treatment on acute ischemic stroke, further demonstrating their effectiveness."

Researcher Affiliation: Academia
  LLM response: "Yi Shen EMAIL Duke University; Pan Xu EMAIL Duke University; Michael M. Zavlanos EMAIL Duke University"

Pseudocode: Yes
  LLM response: "Algorithm 1: Policy learning using biased stochastic gradient descent"

Open Source Code: No
  LLM response: The text does not contain an explicit statement about releasing code or a link to a code repository for the methodology described in this paper.

Open Datasets: Yes
  LLM response: "We further validate our approach using a public dataset that was recorded in a randomized stroke trial. We validate our regularized Wasserstein DRO methods for both OPE and OPL problems on the International Stroke Trial (IST) (Group et al., 1997) dataset. The IST dataset (Sandercock et al., 2011) includes 19,435 patients..."

Dataset Splits: Yes
  LLM response: "To introduce distribution shifts, we split the dataset into a training set and a testing set, and we introduce a selection bias into the training set. Specifically, we randomly remove 50% of the patients in the training set who are not fully conscious."

Hardware Specification: No
  LLM response: The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.

Software Dependencies: No
  LLM response: The paper mentions commercial linear programming solvers such as Gurobi as an example for solving LPs, citing its 2023 reference manual, but it does not specify concrete software dependencies (library names with version numbers) used for the reported experiments.
Experiment Setup: Yes
  LLM response: "The decision trees parameters are cross-validated and shown in Table 4."
  Table 4: Decision tree parameters
    Parameter        | Action 1           | Action 2
    max depth        | 4                  | 4
    min samples leaf | 5                  | 2
    score function   | mean squared error | mean squared error