VITA: A Multi-Source Vicinal Transfer Augmentation Method for Out-of-Distribution Generalization
Authors: Minghui Chen, Cheng Wen, Feng Zheng, Fengxiang He, Ling Shao
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed VITA significantly outperforms the current state-of-the-art augmentation methods, as demonstrated in extensive experiments on corruption benchmarks. |
| Researcher Affiliation | Collaboration | Minghui Chen (1), Cheng Wen (2), Feng Zheng (1)*, Fengxiang He (3), Ling Shao (4); (1) Department of Computer Science and Engineering, Southern University of Science and Technology; (2) The University of Sydney; (3) JD Explore Academy, JD.com Inc.; (4) National Center for Artificial Intelligence, Saudi Data and Artificial Intelligence Authority, Riyadh, Saudi Arabia |
| Pseudocode | No | The paper describes methods in text but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link for the open-source code of the proposed VITA method. |
| Open Datasets | Yes | Dataset. CIFAR-10 (10 categories) and CIFAR-100 (100 categories) both contain small 32×32×3 colour images, with 50k for training and 10k for testing. The ImageNet (Deng et al. 2009) dataset includes 1,000 classes and contains approximately 1.2 million images annotated according to the WordNet hierarchy. To evaluate the corruption robustness of models, we conduct experiments on the CIFAR-10-C, CIFAR-100-C and ImageNet-C datasets (Hendrycks and Dietterich 2019). (A hedged sketch of a CIFAR-10-C evaluation loop follows the table.) |
| Dataset Splits | Yes | CIFAR-10 (10 categories) and CIFAR-100 (100 categories) both contain small 32×32×3 colour images, with 50k for training and 10k for testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. Terms like 'AllConvNet' refer to network architectures, not hardware. |
| Software Dependencies | No | The paper mentions using specific network architectures (e.g., All Convolutional Network, DenseNet-BC, WideResNet, ResNeXt-29) and optimizers (stochastic gradient descent), but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We use stochastic gradient descent with an initial learning rate of 0.1 and a reduce-on-plateau scheduler, and train all architectures over 150 epochs. Separately, models are trained using SGD with 0.9 momentum for 100 epochs, with the initial learning rate of 0.01 divided by ten at epoch 60. (A hedged sketch of this training schedule follows the table.) |
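
The training schedule quoted in the "Experiment Setup" row can be summarised in a short sketch. This is a minimal illustration, assuming PyTorch; the paper does not name a framework, and the model, data loaders, and the 0.9 momentum used alongside the 0.1 learning rate are assumptions, not details from the paper.

```python
# Hedged sketch of the quoted CIFAR training schedule: SGD, initial LR 0.1,
# a reduce-on-plateau scheduler, 150 epochs. Framework (PyTorch), model,
# and loader names are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.optim as optim

def train(model, train_loader, val_loader, device="cuda", epochs=150):
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    # SGD with initial learning rate 0.1; the momentum value for this
    # setting is not stated in the quoted text, so 0.9 is a guess.
    optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    # Learning rate is reduced when the validation loss plateaus.
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer)

    for epoch in range(epochs):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

        # Validation loss drives the plateau scheduler.
        model.eval()
        val_loss, n = 0.0, 0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                val_loss += criterion(model(images), labels).item() * labels.size(0)
                n += labels.size(0)
        scheduler.step(val_loss / n)
    return model
```

A plateau-based scheduler matches the quoted wording; the alternative step schedule (initial LR 0.01, divided by ten at epoch 60, 100 epochs) would instead use a fixed milestone schedule such as `optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60], gamma=0.1)`.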
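
Likewise, the corruption-robustness evaluation on CIFAR-10-C referenced in the "Open Datasets" row amounts to scoring a trained model on each corrupted copy of the test set. Below is a hedged sketch assuming the public CIFAR-10-C release format (one `.npy` file per corruption plus a shared `labels.npy`) and PyTorch; the data directory, corruption subset, and preprocessing are illustrative assumptions, not details from the paper.

```python
# Hedged sketch of a CIFAR-10-C robustness evaluation loop. The .npy layout
# (one file per corruption, 10k test images x 5 severities, plus labels.npy)
# follows the public CIFAR-10-C release; input normalization matching the
# training preprocessing is omitted here for brevity.
import numpy as np
import torch

CORRUPTIONS = ["gaussian_noise", "shot_noise", "impulse_noise"]  # subset for illustration

def eval_cifar10c(model, data_dir="CIFAR-10-C", device="cuda", batch_size=256):
    model = model.to(device).eval()
    labels = torch.from_numpy(np.load(f"{data_dir}/labels.npy")).long()
    errors = {}
    for name in CORRUPTIONS:
        # Images are stored as uint8 HWC arrays; convert to float CHW in [0, 1].
        x = np.load(f"{data_dir}/{name}.npy")
        x = torch.from_numpy(x).permute(0, 3, 1, 2).float() / 255.0
        correct = 0
        with torch.no_grad():
            for i in range(0, len(x), batch_size):
                xb = x[i:i + batch_size].to(device)
                pred = model(xb).argmax(dim=1).cpu()
                correct += (pred == labels[i:i + batch_size]).sum().item()
        errors[name] = 1.0 - correct / len(x)
    # Mean corruption error averaged over the evaluated corruptions.
    return sum(errors.values()) / len(errors), errors
```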