reproducibilityindex.ai

Predictive Performance Comparison of Decision Policies Under Confounding

Authors: Luke Guerdan, Amanda Lee Coston, Ken Holstein, Steven Wu

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We verify our framework theoretically and via synthetic data experiments. We conclude with a real-world application using our framework to support a pre-deployment evaluation of a proposed modification to a healthcare enrollment policy.
Researcher Affiliation	Collaboration	¹Carnegie Mellon University. ²Microsoft Research. Correspondence to: Luke Guerdan <lguerdan@cs.cmu.edu>.
Pseudocode	Yes	Algorithm 1 Plug-in regret bound estimator
Open Source Code	Yes	All code for experiments is publicly available here.
Open Datasets	Yes	We leverage data released by Obermeyer et al. (2019) to construct an enrollment policy comparison task. This dataset contains 48,000 records, where each entry consists of a patient evaluated for enrollment in a high-risk care management program.
Dataset Splits	Yes	We split the data into K disjoint folds, where we denote $O_k$, $O_{-k}$, as the sample inside and outside of fold k, respectively. We then define the plug-in estimator over $O_k$ as... We recover regret estimates at full data efficiency via a cross-fitting approach outlined in Algorithm 1. ... We report results over N = 20 trials with K = 2 folds.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments.
Software Dependencies	No	The paper mentions using 'parametric logistic regression models' and a 'logistic regression classifier' but does not specify version numbers for any software libraries or dependencies (e.g., Python, scikit-learn, PyTorch versions).
Experiment Setup	Yes	We use a linear regression to predict patient cost, and threshold predictions at the 55th percentile. This cutoff matches the threshold for physician enrollment recommendations of the deployed risk assessment (Obermeyer et al., 2019). We use Algorithm 1 to estimate $[\underline{\hat{R}}_\delta(\pi, \pi_0; m, \hat{V}), \overline{\hat{R}}_\delta(\pi, \pi_0; m, \hat{V})]$ under the MSM with Λ = 1.2. We report results over N = 20 trials with K = 2 folds.
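
The fold-splitting and percentile-threshold steps quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. This is not the paper's Algorithm 1 and does not compute regret bounds: it only shows the generic cross-fitting pattern (fit on the sample outside a fold, evaluate on the sample inside it, average over folds) together with a 55th-percentile cutoff. All function names (percentile_threshold, make_folds, cross_fit, fit_threshold, flagged_rate) and the toy data are hypothetical and are not taken from the authors' code.

```python
import random
import statistics


def percentile_threshold(values, q):
    """Return the q-th percentile (0-100) of values, with linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def make_folds(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_fit(data, k, fit, evaluate, seed=0):
    """Fit on the out-of-fold sample, evaluate on the in-fold sample, average."""
    estimates = []
    for fold in make_folds(len(data), k, seed):
        in_fold = set(fold)
        inside = [data[i] for i in fold]
        outside = [data[i] for i in range(len(data)) if i not in in_fold]
        fitted = fit(outside)                        # uses only out-of-fold records
        estimates.append(evaluate(fitted, inside))   # scored on in-fold records
    return statistics.fmean(estimates)


# Hypothetical stand-ins: the "fit" step learns a 55th-percentile cutoff on
# predicted cost, and the "evaluate" step reports the share of in-fold
# records above that cutoff. The paper's plug-in regret estimator is different.
def fit_threshold(predicted_costs):
    return percentile_threshold(predicted_costs, 55)


def flagged_rate(threshold, predicted_costs):
    return sum(c > threshold for c in predicted_costs) / len(predicted_costs)


if __name__ == "__main__":
    rng = random.Random(0)
    toy_costs = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # synthetic data
    print(cross_fit(toy_costs, k=2, fit=fit_threshold, evaluate=flagged_rate))
```

With K = 2, each record is used once for fitting and once for evaluation, which is the sense in which cross-fitting keeps full data efficiency. Repeating the call with different seed values would mimic reporting over several trials.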