CUDA: Curriculum of Data Augmentation for Long-tailed Recognition

Authors: Sumyeong Ahn, Jongwoo Ko, Se-Young Yun

ICLR 2023

Reproducibility assessment. Each entry below gives the variable, the assessed result, and the supporting LLM response.
Research Type: Experimental
LLM Response: We present the results of experiments showing that CUDA effectively achieves better generalization performance compared to the state-of-the-art method on various imbalanced datasets such as CIFAR-100-LT, ImageNet-LT, and iNaturalist 2018.
Researcher Affiliation: Academia
LLM Response: Sumyeong Ahn, Jongwoo Ko, Se-Young Yun; KAIST AI, Seoul, Korea; {sumyeongahn, jongwoo.ko, yunseyoung}@kaist.ac.kr
Pseudocode: Yes
LLM Response: Algorithm 1: CUrriculum of Data Augmentation ... Algorithm 2: V_LoL: Update LoL score
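The cited pseudocode centers on a per-class "level of learning" (LoL) score that controls augmentation strength. The sketch below is not the authors' algorithm verbatim: it assumes a simple acceptance rule in which a class's score is raised when the model correctly classifies at least a fraction γ (gamma) of trial samples augmented at the next strength and lowered otherwise; the names update_lol_score, curriculum_augment, predict_fn, and augment_fn are placeholders, and num_trials stands in for the paper's T.

```python
# Minimal sketch (not the authors' code) of a per-class level-of-learning
# (LoL) curriculum update. Assumptions:
#   - lol[c] is an integer augmentation strength for class c,
#   - predict_fn(x) returns a predicted class label,
#   - augment_fn(x, strength) applies an augmentation of the given strength,
#   - gamma is the acceptance threshold and num_trials plays the role of T.
import random

def update_lol_score(samples_by_class, lol, predict_fn, augment_fn,
                     gamma=0.6, num_trials=10):
    """Raise a class's LoL score if the model handles the next augmentation
    strength well enough; otherwise lower it (never below zero)."""
    for c, samples in samples_by_class.items():
        trial_strength = lol[c] + 1
        picked = random.sample(samples, min(num_trials, len(samples)))
        correct = sum(predict_fn(augment_fn(x, trial_strength)) == c
                      for x in picked)
        if correct >= gamma * len(picked):
            lol[c] = trial_strength           # curriculum gets harder for class c
        else:
            lol[c] = max(lol[c] - 1, 0)       # back off to an easier strength
    return lol

def curriculum_augment(x, label, lol, augment_fn, p_aug=0.5):
    """With probability p_aug, augment a sample with a strength drawn
    uniformly from {0, ..., lol[label]}; otherwise return it unchanged."""
    if random.random() < p_aug and lol[label] > 0:
        return augment_fn(x, random.randint(0, lol[label]))
    return x
```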
Open Source Code: Yes
LLM Response: Code is available at Link
Open Datasets: Yes
LLM Response: We evaluate CUDA on the most commonly used long-tailed image classification tasks: CIFAR-100-LT (Cao et al., 2019), ImageNet-LT (Liu et al., 2019), and a real-world benchmark, iNaturalist 2018 (Van Horn et al., 2018).
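For context, CIFAR-100-LT is typically derived from the balanced CIFAR-100 training set by subsampling each class with an exponentially decaying count (imbalance ratio = largest class size / smallest class size). The snippet below is a hedged illustration of that common construction, assuming torchvision; it is not necessarily the exact subset used in the paper.

```python
# Sketch of building a long-tailed CIFAR-100 subset with the exponential
# class-size profile commonly used for CIFAR-100-LT. The subsampling below
# is an illustration, not the authors' released split.
import numpy as np
from torchvision.datasets import CIFAR100

def long_tailed_indices(targets, num_classes=100, n_max=500,
                        imb_ratio=100, seed=0):
    rng = np.random.default_rng(seed)
    targets = np.asarray(targets)
    indices = []
    for c in range(num_classes):
        # Per-class count decays exponentially from n_max to n_max / imb_ratio.
        n_c = int(n_max * (1.0 / imb_ratio) ** (c / (num_classes - 1)))
        cls_idx = np.where(targets == c)[0]
        indices.extend(rng.choice(cls_idx, size=n_c, replace=False))
    return indices

train_set = CIFAR100(root="./data", train=True, download=True)
lt_indices = long_tailed_indices(train_set.targets)
```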
Dataset Splits: No
LLM Response: The paper frequently mentions 'validation accuracy' and 'validation set' (e.g., 'Validation accuracy on CIFAR-100-LT dataset' in Table 1), implying their use. However, it does not explicitly provide percentages or counts for the training, validation, and test splits, nor does it reference a specific standard split with attribution for reproduction. It mentions 'following the default settings of Cao et al. (2019)', which is too vague for exact reproduction of the splits.
Hardware Specification: No
LLM Response: The paper does not provide specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for running the experiments. It only mentions the backbone network architectures used (e.g., 'ResNet-32', 'ResNet-50').
Software Dependencies: No
LLM Response: The paper mentions using 'ResNet-32' and 'SGD' but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or CUDA versions).
Experiment Setup: Yes
LLM Response: The network is trained with SGD with a momentum of 0.9 and a weight decay of 2 × 10⁻⁴. The initial learning rate is 0.1, and a linear learning-rate warm-up is used in the first 5 epochs to reach the initial learning rate. During training over 200 epochs, the learning rate is decayed at the 160th and 180th epochs by 0.01. For ImageNet-LT and iNaturalist, ResNet-50 is used as the backbone network and is trained for 100 epochs. The learning rate is decayed at the 60th and 80th epochs by 0.1. As with CIFAR, for cRT, RIDE, and BCL, we follow the original experimental settings of the officially released code. For the hyperparameter values of CUDA, we apply a p_aug of 0.5 and a T of 10 for all experiments. For γ, we set the value to 0.6 for CIFAR-100-LT and 0.4 for ImageNet-LT and iNaturalist 2018.
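A minimal sketch of how the reported CIFAR-100-LT optimization schedule could be written in PyTorch, assuming the "decayed ... by 0.01" wording means multiplying the learning rate by 0.01 at each milestone; the model and training loop are placeholders, not the paper's implementation.

```python
# Sketch of the reported CIFAR-100-LT schedule: SGD (momentum 0.9, weight
# decay 2e-4), base LR 0.1, linear warm-up over the first 5 epochs, and a
# multiplicative decay of 0.01 at epochs 160 and 180 of a 200-epoch run.
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(32 * 32 * 3, 100)  # stand-in for a ResNet-32 backbone
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=2e-4)

def lr_factor(epoch):
    if epoch < 5:                  # linear warm-up to the base LR
        return (epoch + 1) / 5
    if epoch < 160:
        return 1.0
    if epoch < 180:
        return 0.01                # first decay step at epoch 160
    return 0.01 * 0.01             # second decay step at epoch 180

scheduler = LambdaLR(optimizer, lr_lambda=lr_factor)

for epoch in range(200):
    # ... one training epoch over the long-tailed loader would go here ...
    scheduler.step()
```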