Estimating and Explaining Model Performance When Both Covariates and Labels Shift
Authors: Lingjiao Chen, Matei Zaharia, James Y. Zou
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on several real-world datasets with various ML models. Across different datasets and distribution shifts, SEES achieves significant (up to an order of magnitude) shift estimation error improvements over existing approaches. |
| Researcher Affiliation | Academia | Lingjiao Chen1, Matei Zaharia1, James Zou1,2 1Department of Computer Sciences, 2Department of Biomedical Data Science Stanford University |
| Pseudocode | No | The paper describes the algorithmic framework SEES and its components but does not provide a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | We also release our code1. 1https://github.com/stanford-futuredata/SparseJointShift |
| Open Datasets | Yes | Six datasets are used for evaluation. We first simulate various SJS on BANKCHURN [1], COVID-19 [2], and CREDIT [39] to systematically understand the performance of SEES. Next, we apply SEES to EMPLOY, INCOME, and INSURANCE [11] with real-world distribution shifts and perform an in-depth analysis. |
| Dataset Splits | No | The paper states 'Same number of samples are allocated to both source and target datasets.' but does not specify training, validation, or test splits as percentages or absolute counts for the experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. It states 'Please see Appendix D' for compute resources, but Appendix D is not provided in the given text. |
| Software Dependencies | No | The paper mentions using 'a gradient boosting tree model' but does not specify the software package (e.g., XGBoost, LightGBM) or its version, nor does it list other software dependencies with version numbers (e.g., Python, PyTorch versions). |
| Experiment Setup | No | The paper states 'More details on the experiments can be found in the Appendix.' and refers to 'D. Training Details', but the provided main text does not include specific hyperparameters (e.g., learning rate, batch size) or other training configurations. |