UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup

Authors: Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian, Peilin Zhao, Bingzhe Wu, Changqing Zhang, Jianhua Yao

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Further, we conduct extensive empirical studies across a wide range of tasks to validate the effectiveness of our method both qualitatively and quantitatively." |
| Researcher Affiliation | Collaboration | College of Intelligence and Computing, Tianjin University; Hong Kong University of Science and Technology; Tencent AI Lab |
| Pseudocode | Yes | Algorithm 1: The training pseudocode of UMIX. Algorithm 2: The process for obtaining training importance weights. |
| Open Source Code | Yes | Code is available at this URL. "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Code has been released." |
| Open Datasets | Yes | "We perform experiments on three datasets with multiple subpopulations, including Waterbirds [58], CelebA [43] and Civil Comments [9]. ... We conduct experiments on a medical dataset called Camelyon17 [5, 33]" |
| Dataset Splits | No | "Following prior works [41, 72], we assume the group labels of validation samples are available and select the best model based on worst-case accuracy among all subpopulations on the validation set. We also conduct model selection based on the average accuracy to show the impact of validation group label information in our method. The training data is drawn from three hospitals, while the validation and test data are sampled from other hospitals." While the paper mentions the use of a validation set and states that dataset split details are in the Appendix, the specific percentages or counts for the validation split are not provided in the main text. |
| Hardware Specification | No | The paper mentions that hardware specifications are in Appendix B, but no specific hardware details (e.g., GPU models, CPU types) are provided within the main text. |
| Software Dependencies | No | The paper mentions that training details are in Appendix B, but no specific software versions (e.g., Python 3.8, PyTorch 1.9) are provided within the main text. |
| Experiment Setup | Yes | Algorithm 1 (the training pseudocode of UMIX). Input: training dataset D and the corresponding importance weights w = [w1, ..., wN], hyperparameter σ to control the probability of doing UMIX, and parameter α of the beta distribution. Algorithm 2 (the process for obtaining training importance weights). Input: training dataset D, sampling start epoch Ts, the number of sampling T, and upweight hyperparameter η. |
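The Experiment Setup row lists only the algorithm inputs, not the algorithm bodies. A minimal sketch of what those inputs suggest is given below, assuming standard mixup mechanics: the function names `umix_batch` and `importance_weights`, the returned loss decomposition, and the `1 + η·frequency` upweighting rule are illustrative assumptions based on the stated inputs, not the paper's exact formulation (which is in its Algorithms 1 and 2).

```python
import numpy as np

def umix_batch(x, y, w, alpha=1.0, sigma=0.5, rng=None):
    """Sketch of one UMIX-style mixup step (Algorithm 1 inputs).

    With probability sigma, mix each sample with a random partner using a
    Beta(alpha, alpha) coefficient lam; otherwise use the batch as-is.
    Returns the (possibly mixed) inputs, both label sets, both importance
    weight sets, and lam, so the caller can form a weighted loss such as
        lam * w_a * CE(f(x_mix), y_a) + (1 - lam) * w_b * CE(f(x_mix), y_b)
    (assumed loss form; the exact weighting is specified in the paper).
    """
    rng = rng or np.random.default_rng()
    if rng.random() >= sigma:            # skip mixup: lam = 1 recovers plain
        return x, y, y, w, w, 1.0        # importance-weighted training
    lam = float(rng.beta(alpha, alpha))  # mixing coefficient ~ Beta(alpha, alpha)
    perm = rng.permutation(len(x))       # random pairing within the batch
    x_mix = lam * x + (1.0 - lam) * x[perm]
    return x_mix, y, y[perm], w, w[perm], lam

def importance_weights(miss_counts, num_snapshots, eta=1.0):
    """Sketch of Algorithm 2's output: upweight samples that sampled model
    snapshots (taken from epoch Ts onward, num_snapshots = T of them)
    misclassified more often. The 1 + eta * frequency rule is an assumption
    inferred from the stated inputs, not the paper's exact formula."""
    u = np.asarray(miss_counts, dtype=float) / num_snapshots  # frequency in [0, 1]
    return 1.0 + eta * u
```

With `sigma=1.0` every batch is mixed; with `sigma=0.0` the step reduces to ordinary importance-weighted training, which makes the role of σ as an interpolation knob between the two regimes easy to verify.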