PatchUp: A Feature-Space Block-Level Regularization Technique for Convolutional Neural Networks
Authors: Mojtaba Faramarzi, Mohammad Amini, Akilesh Badrinaaraayanan, Vikas Verma, Sarath Chandar
AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on CIFAR10/100, SVHN, Tiny-ImageNet, and ImageNet using ResNet architectures including PreActResNet18/34, WRN-28-10, ResNet101/152 models show that PatchUp improves upon, or equals, the performance of current state-of-the-art regularizers for CNNs. |
| Researcher Affiliation | Academia | 1 Mila - Quebec AI Institute, 2 University of Montreal, 3 McGill University, 4 Aalto University, Finland, 5 École Polytechnique de Montréal |
| Pseudocode | Yes | PatchUp operations are illustrated in Fig. 1 (see more details in Algorithm 1 in the Appendix). |
| Open Source Code | Yes | 1The code is available: https://github.com/chandar-lab/PatchUp |
| Open Datasets | Yes | This section presents the results of applying PatchUp to image classification tasks using various benchmark datasets such as CIFAR10, CIFAR100 (Krizhevsky, Hinton et al. 2009), SVHN (the standard version with 73257 training samples) (Netzer et al. 2011), Tiny-ImageNet (Chrabaszcz, Loshchilov, and Hutter 2017) |
| Dataset Splits | No | The paper mentions using a validation set in Table 6 ('The validation error on CIFAR100'), but does not explicitly describe the split percentages or methodology for creating train/validation/test splits. |
| Hardware Specification | Yes | PatchUp and other methods reach the reported performance with the WRN-28-10 model on Tiny-ImageNet after about 23 hours of training using one GPU (V100). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | The details of the experimental setup and hyper-parameter tuning are given in Appendix E. We set α to 2 in PatchUp. PatchUp has patchup_prob, γ, and block_size as additional hyperparameters. patchup_prob is the probability that PatchUp is performed for a given mini-batch. Based on our hyperparameter tuning, Hard PatchUp yields the best performance with patchup_prob, γ, and block_size as 0.7, 0.5, and 7, respectively. Soft PatchUp achieves the best performance with patchup_prob, γ, and block_size as 1.0, 0.75, and 7, respectively. |
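
For reference, the snippet below is a minimal PyTorch-style sketch of a PatchUp-like block operation on an intermediate feature map, parameterized by the hyper-parameters quoted above (patchup_prob, γ, block_size, α). The function name, the DropBlock-style mask construction, and the return values are illustrative assumptions for this report, not the official implementation from the linked repository.

```python
# Hypothetical sketch of a PatchUp-style operation on a hidden feature map.
# Assumed names: patchup(), gamma, block_size; the official repo may differ.
import torch
import torch.nn.functional as F


def patchup(h, alpha=2.0, gamma=0.5, block_size=7, hard=True):
    """Swap (Hard) or mix (Soft) contiguous feature-map blocks between
    randomly paired examples in the mini-batch.

    h: hidden representation of shape (N, C, H, W).
    Returns the altered features, the permutation used for pairing, and the
    fraction of unchanged features (useful when building a mixed target).
    The caller is expected to invoke this only with probability patchup_prob
    per mini-batch, per the paper's description.
    """
    n, _, height, width = h.shape
    perm = torch.randperm(n, device=h.device)

    # DropBlock-style mask: sample block centres with probability gamma,
    # then expand each centre to a block_size x block_size square via max
    # pooling (assumes an odd block_size so spatial dims are preserved).
    centres = torch.bernoulli(
        torch.full((n, 1, height, width), gamma, device=h.device))
    block_mask = F.max_pool2d(centres, kernel_size=block_size,
                              stride=1, padding=block_size // 2)
    keep_mask = 1.0 - block_mask  # 1 where features stay unchanged

    if hard:
        # Hard PatchUp: replace selected blocks with the paired sample's blocks.
        mixed = keep_mask * h + block_mask * h[perm]
    else:
        # Soft PatchUp: interpolate inside the selected blocks,
        # with lam ~ Beta(alpha, alpha).
        lam = torch.distributions.Beta(alpha, alpha).sample().to(h.device)
        mixed = keep_mask * h + block_mask * (lam * h + (1 - lam) * h[perm])

    portion_unchanged = keep_mask.mean()
    return mixed, perm, portion_unchanged
```

Under the reported settings, Hard PatchUp would be called with gamma=0.5 and block_size=7 on roughly 70% of mini-batches (patchup_prob = 0.7), and Soft PatchUp with gamma=0.75, block_size=7 on every mini-batch (patchup_prob = 1.0), with α = 2 for the Beta distribution.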