Predictive Performance Comparison of Decision Policies Under Confounding
Authors: Luke Guerdan, Amanda Lee Coston, Ken Holstein, Steven Wu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our framework theoretically and via synthetic data experiments. We conclude with a real-world application using our framework to support a pre-deployment evaluation of a proposed modification to a healthcare enrollment policy. |
| Researcher Affiliation | Collaboration | ¹Carnegie Mellon University. ²Microsoft Research. Correspondence to: Luke Guerdan <lguerdan@cs.cmu.edu>. |
| Pseudocode | Yes | Algorithm 1 Plug-in regret bound estimator |
| Open Source Code | Yes | All code for experiments is publicly available here. |
| Open Datasets | Yes | We leverage data released by Obermeyer et al. (2019) to construct an enrollment policy comparison task. This dataset contains 48,000 records, where each entry consists of a patient evaluated for enrollment in a high-risk care management program. |
| Dataset Splits | Yes | We split the data into K disjoint folds, where we denote $O_k$, $O_{-k}$ as the sample inside and outside of fold k, respectively. We then define the plug-in estimator over $O_k$ as... We recover regret estimates at full data efficiency via a cross-fitting approach outlined in Algorithm 1. ... We report results over N = 20 trials with K = 2 folds. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'parametric logistic regression models' and a 'logistic regression classifier' but does not specify version numbers for any software libraries or dependencies (e.g., Python, scikit-learn, PyTorch versions). |
| Experiment Setup | Yes | We use a linear regression to predict patient cost, and threshold predictions at the 55th percentile. This cutoff matches the threshold for physician enrollment recommendations of the deployed risk assessment (Obermeyer et al., 2019). We use Algorithm 1 to estimate $[\underline{\hat{R}}_\delta(\pi, \pi_0; m, \hat{V}),\ \overline{\hat{R}}_\delta(\pi, \pi_0; m, \hat{V})]$ under the MSM with $\Lambda = 1.2$. We report results over N = 20 trials with K = 2 folds. |
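
The fold construction and thresholding quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. It is not the authors' code and does not implement the regret bound estimator of Algorithm 1. It only shows K = 2 disjoint folds, a linear model fit on the sample outside each fold, predictions on the sample inside it, and a cutoff at the 55th percentile. The synthetic data, the function names, and the choice to assign the positive decision to predictions above the cutoff are all assumptions made for illustration.

```python
import numpy as np

def cross_fit_folds(n, k, seed=0):
    """Yield (inside_idx, outside_idx) for each of k disjoint folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):
        inside = folds[i]
        outside = np.concatenate([folds[j] for j in range(k) if j != i])
        yield inside, outside

def fit_linear(X, y):
    """Ordinary least squares with an intercept column."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def cross_fit_policy(X, y, k=2, percentile=55.0, seed=0):
    """Fit on the sample outside each fold, predict inside it, then threshold."""
    preds = np.empty(len(y))
    for inside, outside in cross_fit_folds(len(y), k, seed):
        coef = fit_linear(X[outside], y[outside])
        preds[inside] = predict_linear(coef, X[inside])
    cutoff = np.percentile(preds, percentile)
    # Assumption: predictions above the cutoff receive the positive decision.
    return preds, (preds > cutoff).astype(int)

# Hypothetical synthetic data standing in for patient features and cost.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(scale=0.1, size=1000)
preds, policy = cross_fit_policy(X, y)
```

Every record receives an out-of-fold prediction, which is the sense in which cross-fitting uses the full sample while keeping model fitting and evaluation on separate data.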