Label Distributionally Robust Losses for Multi-class Classification: Consistency, Robustness and Adaptivity

Authors: Dixian Zhu, Yiming Ying, Tianbao Yang

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our contributions include: (3) we demonstrate stable and competitive performance for the proposed adaptive LDR loss on 7 benchmark datasets under 6 noisy label and 1 clean settings against 13 loss functions, and on one real-world noisy dataset.
Researcher Affiliation | Academia | 1 The University of Iowa, Iowa City, USA; 2 University at Albany, Albany, USA; 3 Texas A&M University, College Station, USA.
Pseudocode | Yes | Algorithm 1: Stochastic Optimization for ALDR-KL loss (a hedged loss sketch appears after the table).
Open Source Code | Yes | The method is open-sourced at https://github.com/Optimization-AI/ICML2023_LDR.
Open Datasets | Yes | We conduct experiments on 7 benchmark datasets, namely ALOI, News20, Letter, Vowel (Fan & Lin), Kuzushiji-49, CIFAR-100 and Tiny-ImageNet (Clanuwat et al., 2018; Deng et al., 2009). The statistics of the datasets are summarized in Table 5 in the Appendix.
Dataset Splits | Yes | For all the experiments unless specified otherwise, we manually add label noises to the training and validation data, but keep the testing data clean. We apply 5-fold cross-validation to conduct the training and evaluation, and report the mean and standard deviation for the testing top-k accuracy, where k ∈ {1, 2, 3, 4, 5}. (See the noise-injection sketch after the table.)
Hardware Specification | Yes | Each entry stands for mean and standard deviation for 100 consecutive epochs running on an x86_64 GNU/Linux cluster with an NVIDIA GeForce GTX 1080 Ti GPU card.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used for replication.
Experiment Setup | Yes | We fix the weight decay as 5e-3, batch size as 64, and total running epochs as 100 for all the datasets except Kuzushiji, CIFAR-100 and Tiny-ImageNet (we run 30 epochs for them because the data sizes are large). We utilize the momentum optimizer with the initial learning rate tuned in {1e-1, 1e-2, 1e-3} for all experiments. (See the training-loop sketch after the table.)
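
The Pseudocode row points to Algorithm 1 (stochastic optimization for the ALDR-KL loss) without reproducing it. For orientation only, below is a minimal PyTorch sketch of a KL-regularized label-distributionally-robust loss in the spirit of LDR-KL. It assumes the per-class cost is the margin s_k - s_y and a fixed robustness parameter lam, which are simplifications; the paper's exact per-class losses, margin terms, and the adaptive per-example update in Algorithm 1 should be taken from the paper and the released repository.

```python
import math
import torch

def ldr_kl_loss(logits, targets, lam=1.0):
    """Sketch of a KL-regularized label-distributionally-robust loss.

    Closed form of  max_{p in simplex}  sum_k p_k * (s_k - s_y) - lam * KL(p || uniform),
    which equals    lam * ( logsumexp((s_k - s_y) / lam) - log K ).
    """
    num_classes = logits.size(1)
    # per-class margins s_k - s_y (zero at the true class)
    margins = logits - logits.gather(1, targets.view(-1, 1))
    return (lam * (torch.logsumexp(margins / lam, dim=1) - math.log(num_classes))).mean()

# usage on random data
scores = torch.randn(8, 10, requires_grad=True)
labels = torch.randint(0, 10, (8,))
ldr_kl_loss(scores, labels, lam=1.0).backward()
```

Under this simplified formulation, lam -> 0 approaches the worst-class margin, lam = 1 coincides with softmax cross-entropy up to the constant log K, and large lam approaches the average margin; the paper's ALDR-KL loss adds the adaptive mechanism described in Algorithm 1.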
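
The Dataset Splits row describes injecting label noise into the training and validation folds while keeping the test set clean, under 5-fold cross-validation. The following sketch illustrates that protocol with scikit-learn's KFold; the symmetric (uniform-flip) noise model, the 0.4 noise rate, the placeholder arrays, and the helper name add_symmetric_label_noise are illustrative assumptions rather than the paper's exact settings (the paper evaluates 6 noisy-label settings and 1 clean setting).

```python
import numpy as np
from sklearn.model_selection import KFold

def add_symmetric_label_noise(y, noise_rate, num_classes, rng):
    """Flip each label, with probability noise_rate, to a uniformly random *other* class."""
    y_noisy = y.copy()
    flip = rng.random(len(y)) < noise_rate
    y_noisy[flip] = (y[flip] + rng.integers(1, num_classes, size=int(flip.sum()))) % num_classes
    return y_noisy

rng = np.random.default_rng(0)
NUM_CLASSES = 10                                   # placeholder; dataset-dependent
X_dev = rng.standard_normal((1000, 20))            # placeholder development features
y_dev = rng.integers(0, NUM_CLASSES, size=1000)    # placeholder development labels
# X_test / y_test would be the held-out test split and is left clean.

for fold, (tr, va) in enumerate(KFold(n_splits=5, shuffle=True, random_state=0).split(X_dev)):
    y_tr = add_symmetric_label_noise(y_dev[tr], noise_rate=0.4, num_classes=NUM_CLASSES, rng=rng)
    y_va = add_symmetric_label_noise(y_dev[va], noise_rate=0.4, num_classes=NUM_CLASSES, rng=rng)
    # train on (X_dev[tr], y_tr), tune hyper-parameters on (X_dev[va], y_va),
    # and report mean/std of top-k accuracy on the clean test split.
```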
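
The Experiment Setup row fixes weight decay 5e-3, batch size 64, 100 (or 30) epochs, and a momentum optimizer with the initial learning rate tuned over {1e-1, 1e-2, 1e-3}. The sketch below wires those quoted values into a standard PyTorch SGD training loop; the linear model, synthetic tensors, and the momentum coefficient 0.9 are assumptions (the excerpt only says "momentum optimizer"), and it reuses the ldr_kl_loss sketch above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins for the real model and dataset; the hyper-parameters
# (weight decay 5e-3, batch size 64, LR grid {1e-1, 1e-2, 1e-3}) follow the quoted setup.
NUM_CLASSES, NUM_FEATURES = 10, 20
model = torch.nn.Linear(NUM_FEATURES, NUM_CLASSES)
train_set = TensorDataset(torch.randn(512, NUM_FEATURES),
                          torch.randint(0, NUM_CLASSES, (512,)))
loader = DataLoader(train_set, batch_size=64, shuffle=True)

# momentum=0.9 is an assumption; in practice the initial lr is selected
# from {1e-1, 1e-2, 1e-3} on the noisy validation folds.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1,
                            momentum=0.9, weight_decay=5e-3)

for epoch in range(100):          # 30 epochs for Kuzushiji-49, CIFAR-100 and Tiny-ImageNet
    for x, y in loader:
        optimizer.zero_grad()
        loss = ldr_kl_loss(model(x), y, lam=1.0)   # loss sketch defined above
        loss.backward()
        optimizer.step()
```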