C-Mixup: Improving Generalization in Regression

Authors: Huaxiu Yao, Yiping Wang, Linjun Zhang, James Y. Zou, Chelsea Finn

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate C-Mixup on eleven datasets, ranging from tabular to video data. Compared to the best prior approach, C-Mixup achieves 6.56%, 4.76%, and 5.82% improvements in in-distribution generalization, task generalization, and out-of-distribution robustness, respectively.
Researcher Affiliation | Academia | Stanford University, Zhejiang University, Rutgers University
Pseudocode | Yes (see the sketch after this table) | Algorithm 1 Training with C-Mixup
Open Source Code | Yes | Code is released at https://github.com/huaxiuyao/C-Mixup.
Open Datasets | Yes | We use the following five datasets to evaluate the performance of in-distribution generalization (see Appendix C.1 for detailed data statistics). (1)&(2) Airfoil Self-Noise (Airfoil) and NO2 [35] are both tabular datasets... (3)&(4) Exchange-Rate and Electricity [40] are two time-series datasets... (5) Echocardiogram Videos (Echo) [50] is an ejection fraction prediction dataset...
Dataset Splits | Yes | In our experiments, we apply cross-validation to tune all hyperparameters with grid search. ... For Airfoil and NO2, we split the data into 80% training and 20% test sets, without a validation set, as is common practice for these datasets. ... For Echo, we use the default training/validation/testing split provided by the dataset.
Hardware Specification | Yes | We train all models on a single Nvidia Tesla A100 GPU.
Software Dependencies | No | The paper mentions various models and architectures but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes (see the configuration sketch after this table) | For all models, we use the Adam optimizer [40] with a learning rate of 1e-3 and a batch size of 256. We train for 200 epochs and apply early stopping with a patience of 20. For mixup variants, we set α = 1.0. For C-Mixup, we set σ = 0.5.
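The Pseudocode row above refers to Algorithm 1 (Training with C-Mixup). As a rough, non-authoritative sketch of the idea behind that algorithm, the PyTorch-style snippet below draws each example's mixing partner with probability proportional to a Gaussian kernel over label distance and then mixes inputs and labels with a Beta-distributed ratio. It assumes scalar regression labels and a mean-squared-error loss; the helper names gaussian_sampling_probs and cmixup_step, and the model/optimizer interface, are illustrative and not taken from the released code.

    import numpy as np
    import torch
    import torch.nn.functional as F

    def gaussian_sampling_probs(y, sigma=0.5):
        """Pairwise sampling matrix: P(j | i) proportional to exp(-(y_i - y_j)^2 / (2 sigma^2))."""
        y = np.asarray(y, dtype=np.float64).reshape(-1)
        d2 = (y[:, None] - y[None, :]) ** 2            # squared label distances
        probs = np.exp(-d2 / (2.0 * sigma ** 2))
        return probs / probs.sum(axis=1, keepdims=True)

    def cmixup_step(model, optimizer, x, y, probs, idx, alpha=1.0):
        """One C-Mixup gradient step on the batch given by row indices `idx`."""
        lam = float(np.random.beta(alpha, alpha))
        # For each anchor, draw its mixing partner from the label-distance kernel.
        partner = np.array([np.random.choice(len(probs), p=probs[i]) for i in idx])
        a = torch.as_tensor(idx, dtype=torch.long)
        b = torch.as_tensor(partner, dtype=torch.long)
        x_mix = lam * x[a] + (1.0 - lam) * x[b]        # mix inputs ...
        y_mix = lam * y[a] + (1.0 - lam) * y[b]        # ... and labels with the same ratio
        loss = F.mse_loss(model(x_mix).squeeze(-1), y_mix)  # assumes model output of shape (B, 1)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

In this kernel, a large sigma makes the pairing nearly uniform (recovering vanilla-mixup-style random pairing), while a small sigma restricts mixing to examples with similar labels.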
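Continuing the sketch above, the snippet below is a toy wiring of the configuration reported in the Experiment Setup row (Adam with learning rate 1e-3, batch size 256, 200 epochs, early stopping with patience 20, α = 1.0, σ = 0.5). The MLP, the synthetic data, and the use of the training set as a stand-in validation set are placeholders for illustration only, not the authors' setup; it reuses the gaussian_sampling_probs and cmixup_step helpers defined above.

    import numpy as np
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    np.random.seed(0)
    x = torch.randn(2048, 10)                          # synthetic features (placeholder data)
    y = x.sum(dim=1) + 0.1 * torch.randn(2048)         # synthetic scalar targets

    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # reported: Adam, lr 1e-3

    probs = gaussian_sampling_probs(y.numpy(), sigma=0.5)       # reported: sigma = 0.5
    batch_size, patience = 256, 20                              # reported: batch 256, patience 20
    best_val, bad_epochs = float("inf"), 0

    for epoch in range(200):                                    # reported: 200 epochs
        perm = np.random.permutation(len(x))
        for start in range(0, len(x), batch_size):
            idx = perm[start:start + batch_size]
            cmixup_step(model, optimizer, x, y, probs, idx, alpha=1.0)  # reported: alpha = 1.0
        with torch.no_grad():
            # Stand-in "validation" loss on the training data, for illustration only.
            val_loss = nn.functional.mse_loss(model(x).squeeze(-1), y).item()
        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:                          # early stopping
                break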