Deep AutoAugment
Authors: Yu Zheng, Zhi Zhang, Shen Yan, Mi Zhang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that even without default augmentations, we can learn an augmentation policy that achieves strong performance comparable to that of previous works. Extensive ablation studies show that regularized gradient matching is an effective search method for data augmentation policies (a hedged sketch of this gradient-matching signal appears after the table). Our code is available at: https://github.com/MSU-MLSys-Lab/DeepAA. |
| Researcher Affiliation | Collaboration | Yu Zheng (Michigan State University), Zhi Zhang (Amazon Web Services), Shen Yan (Michigan State University), Mi Zhang (Michigan State University) |
| Pseudocode | No | The paper includes mathematical equations for its formulation but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at: https://github.com/MSU-MLSys-Lab/DeepAA. |
| Open Datasets | Yes | We evaluate the performance of DeepAA on three datasets (CIFAR-10, CIFAR-100, and ImageNet) and compare it with existing automated data augmentation search methods. |
| Dataset Splits | No | The paper mentions training on 'a subset of 4,000 randomly selected samples from CIFAR-10' for policy search and constructing 'a validation batch by sampling a batch of original data from the validation set'. However, it does not explicitly state the specific train/validation/test dataset split percentages or sample counts for the overall experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or cloud computing instance specifications. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' and specific neural network architectures like 'Wide-ResNet-28-10' and 'ResNet-50', but it does not specify version numbers for any software libraries (e.g., PyTorch, TensorFlow, CUDA) or programming languages used. |
| Experiment Setup | Yes | We first train the network on a subset of 4,000 randomly selected samples from CIFAR-10. We then progressively update the policy network parameters θ_k (k = 1, 2, ..., K) for 512 iterations for each of the K augmentation layers. We use the Adam optimizer (Kingma & Ba, 2015) and set the learning rate to 0.025 for policy updating. The evaluation configurations are kept consistent with those of Fast AutoAugment. ...we use a step learning rate scheduler with a reduction factor of 0.1, and we train and evaluate with images of size 224×224. ...The best found parameters are summarized in Table 8 in Appendix. (Table 8 provides Learning Rate, Weight Decay, and Epochs for CIFAR-10/100 with Batch Augmentation.) A minimal sketch of this policy-update schedule follows the table. |
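
The paper's search method is described as regularized gradient matching: the augmentation policy is rewarded when gradients computed on augmented training data align with gradients computed on original validation data. The snippet below is a minimal sketch of that matching signal only, not the authors' implementation; the function names (`flat_grad`, `gradient_matching_score`), the use of a generic PyTorch `model` and `criterion`, and the omission of the paper's regularization term are all assumptions made for illustration.

```python
# Minimal sketch (not the authors' code): cosine similarity between the gradient
# of the loss on an augmented training batch and the gradient on a batch of
# original validation data, used as a policy-search signal.
import torch
import torch.nn.functional as F


def flat_grad(loss, params):
    """Gradient of `loss` w.r.t. `params`, flattened into a single vector."""
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])


def gradient_matching_score(model, criterion, aug_batch, val_batch):
    """Higher is better: augmented-data gradients point in the same direction
    as validation-data gradients."""
    params = [p for p in model.parameters() if p.requires_grad]

    x_aug, y_aug = aug_batch
    g_aug = flat_grad(criterion(model(x_aug), y_aug), params)

    x_val, y_val = val_batch
    g_val = flat_grad(criterion(model(x_val), y_val), params)

    return F.cosine_similarity(g_aug.unsqueeze(0), g_val.unsqueeze(0)).squeeze()
```

In the paper this score would additionally be regularized and differentiated with respect to the policy parameters rather than treated as a plain scalar.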
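
The experiment-setup row reports a progressive schedule: the K augmentation layers are searched one at a time, with 512 Adam iterations at learning rate 0.025 per layer. The loop below is a minimal sketch of that schedule under stated assumptions: `PolicyLayer`, `num_ops`, the value of `K`, and the placeholder `policy_objective` are hypothetical; only the optimizer choice, learning rate, and iteration count come from the paper's reported setup.

```python
# Minimal sketch of the progressive policy-update schedule; PolicyLayer and the
# placeholder objective are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

K = 5                  # number of augmentation layers (illustrative value)
ITERS_PER_LAYER = 512  # reported: 512 iterations per layer
POLICY_LR = 0.025      # reported: Adam learning rate for policy updates


class PolicyLayer(nn.Module):
    """Hypothetical policy layer: learnable logits over candidate augmentations."""

    def __init__(self, num_ops: int = 16):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_ops))

    def op_distribution(self) -> torch.Tensor:
        return torch.softmax(self.logits, dim=0)


def policy_objective(layer: PolicyLayer) -> torch.Tensor:
    """Placeholder objective (negative entropy); the paper's objective would be
    built from the regularized gradient-matching score instead."""
    p = layer.op_distribution()
    return (p * (p + 1e-12).log()).sum()


policy_layers = [PolicyLayer() for _ in range(K)]
for layer in policy_layers:  # layers are searched progressively, one at a time
    optimizer = torch.optim.Adam(layer.parameters(), lr=POLICY_LR)
    for _ in range(ITERS_PER_LAYER):
        optimizer.zero_grad()
        loss = policy_objective(layer)
        loss.backward()
        optimizer.step()
```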