A hierarchical decomposition for explaining ML performance discrepancies
Authors: Harvineet Singh, Fan Xia, Adarsh Subbaswamy, Alexej Gossmann, Jean Feng
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the utility of our framework in real-world examples of prediction models for hospital readmission and insurance coverage. Code for reproducing experiments is available at https://github.com/jjfeng/HDPD. |
| Researcher Affiliation | Collaboration | Harvineet Singh¹, Fan Xia¹, Adarsh Subbaswamy², Alexej Gossmann², Jean Feng¹; ¹University of California, San Francisco; ²U.S. Food and Drug Administration, Center for Devices and Radiological Health |
| Pseudocode | Yes | Algorithm 1 Aggregate decompositions into baseline, conditional covariate, and conditional outcome shifts; Algorithm 2 VALUECONDITIONALOUTCOME(S): Value for s-partial conditional outcome shift for a subset s; Algorithm 3 VALUECONDITIONALCOVARIATE(S): Value for s-partial conditional covariate shift for a subset s; Algorithm 4 Detailed decomposition for conditional outcome and covariate shift |
| Open Source Code | Yes | Code for reproducing experiments is available at https://github.com/jjfeng/HDPD. |
| Open Datasets | Yes | We analyze a neural network trained to predict whether a person has public health insurance using data from Nebraska in the American Community Survey (source, n = 3000), applied to data from Louisiana (target, n = 6000). |
| Dataset Splits | Yes | Let the data be randomly split into training and evaluation partitions. ... We fit all models on 80% of the data points from both source and target datasets which is the Tr partition, and keep the remaining 20% for computing the estimators which is the Ev partition. |
| Hardware Specification | Yes | All experiments are run on a 2.60 GHz processor with 8 CPU cores. |
| Software Dependencies | No | The paper mentions using 'scikit-learn implementations' but does not specify version numbers for any software dependencies like scikit-learn, Python, or other libraries. |
| Experiment Setup | Yes | We use 3-fold cross validation to select models. ... We clip the predicted probabilities from the density model for π at 10⁻⁶ to avoid very large density weights. ... Specific hyperparameter ranges for the grid search are provided in the code. |
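
The split and clipping steps quoted in the table can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed seed, the symmetric clipping range, and the odds form `p / (1 - p)` for the density weight are all assumptions made for the example. The actual code lives in the repository linked above.

```python
import random


def split_train_eval(n_rows, train_frac=0.8, seed=0):
    """Randomly split row indices into a training (Tr) and an evaluation (Ev) partition."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_train = int(round(train_frac * n_rows))
    return indices[:n_train], indices[n_train:]


def clip_probability(p, eps=1e-6):
    """Clip a predicted probability into [eps, 1 - eps] (symmetric range is an assumption)."""
    return min(max(p, eps), 1.0 - eps)


def density_weight(p_target, eps=1e-6):
    """Weight derived from a clipped probability.

    The odds form p / (1 - p) is an illustrative assumption; it shows why
    clipping matters: without it, p close to 1 gives an unbounded weight.
    """
    p = clip_probability(p_target, eps)
    return p / (1.0 - p)


# Example: the source dataset size quoted in the table (n = 3000).
train_idx, eval_idx = split_train_eval(3000)
```

With these defaults, 80% of the rows go to the Tr partition used for model fitting and the remaining 20% to the Ev partition used for computing the estimators; the same split would be applied separately to the source and target datasets. Clipping keeps every weight in this sketch below roughly 10⁶.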