Learning with Noisy Labels Using Hyperspherical Margin Weighting

Authors: Shuo Zhang, Yuwen Li, Zhongyu Wang, Jianqing Li, Chengyu Liu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on both benchmark and real-world datasets indicate that our HMW outperforms many state-of-the-art approaches in learning with noisy label tasks. In this section, we first illustrate the effectiveness of our HMW through various empirical understanding experiments. Then, the HMW is compared with some SOTA methods in this field on both benchmark datasets and real-world datasets. Besides, to discuss the hyperparameter settings of our strategy, some ablation studies are also conducted."
Researcher Affiliation | Academia | "¹School of Instrument Science and Engineering, Southeast University, China; ²State Key Laboratory of Digital Medical Engineering, Southeast University, China; ³School of Biological Science and Medical Engineering, Southeast University, China. {zs techo, liyuwen, zhongyu, ljq, chengyu}@seu.edu.cn"
Pseudocode | No | The paper describes the methodology through text and mathematical equations, but it does not include a distinct pseudocode block or algorithm.
Open Source Code | Yes | "Codes are available at https://github.com/Zhangshuojackpot/HMW."
Open Datasets | Yes | "Even accepted high-quality datasets, such as ImageNet (Deng et al. 2009), include erroneous labels (Northcutt, Athalye, and Mueller 2021)." The experiments cover both benchmark datasets (CIFAR-10, CIFAR-100) and real-world datasets (ANIMAL-10N, WebVision, ILSVRC12), all of which are publicly available. (A dataset-loading sketch in Python follows the table.)
Dataset Splits | Yes | "Bilevel learning (Jenni and Favaro 2018) used a bilevel optimization strategy to regularize the overfitting of a model using a clean validation dataset." "We employ the first 50 categories of the Google image subset for training data and evaluate the performance on both the WebVision and ILSVRC12 validation sets. The related experimental settings are consistent with those in (Karim et al. 2022a)." (A split-filtering sketch follows the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions software components such as CCE, the SGD optimizer, a VGG19-BN backbone, and an Inception-ResNet backbone, but it provides no version numbers for these or any other software dependencies.
Experiment Setup | Yes | "As for the hyperparameter settings in the HMW, when using CCE and SCE for training, we set α = 100 and β = 100. While UNICON is applied for CIFAR-10 and the noise rate η ≤ 0.5, we set α = 1 and β = 1. When η > 0.5, we set α = 100 and β = 100... The batch size and the epoch are set to 128 and 200, respectively... The weight decay is set to 1 × 10⁻³ for ANIMAL-10N and 5 × 10⁻⁴ for WebVision. The learning rate is set to 0.1 for ANIMAL-10N and 0.01 for WebVision, respectively. Additionally, RandomCrop, RandomHorizontalFlip, and CutMix are picked as data augmentation strategies." (A training-setup sketch follows the table.)
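
The benchmark datasets named in the "Open Datasets" row are all publicly available. As a quick orientation, here is a minimal sketch of fetching the two CIFAR benchmarks through torchvision (ANIMAL-10N and WebVision are not packaged in torchvision and must be downloaded from their own project pages); the root path "./data" is an arbitrary choice:

    import torchvision
    import torchvision.transforms as T

    # Plain tensor conversion here; the paper's augmentation pipeline is
    # sketched under "Experiment Setup" below.
    to_tensor = T.ToTensor()

    # CIFAR-10 / CIFAR-100 download automatically via torchvision.
    cifar10_train = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True, transform=to_tensor)
    cifar10_test = torchvision.datasets.CIFAR10(
        root="./data", train=False, download=True, transform=to_tensor)
    cifar100_train = torchvision.datasets.CIFAR100(
        root="./data", train=True, download=True, transform=to_tensor)

    print(len(cifar10_train), len(cifar10_test), len(cifar100_train))
    # 50000 10000 50000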
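The 50-category WebVision split quoted in the "Dataset Splits" row is mechanical to reproduce. A minimal sketch, assuming the standard WebVision 1.0 file-list format (one "relative/path label" pair per line); the filename info/train_filelist_google.txt follows the official release layout but should be treated as an assumption here:

    # Keep only the first 50 categories (labels 0..49) of the Google image
    # subset, matching the split described in (Karim et al. 2022a).
    NUM_CLASSES = 50

    samples = []
    with open("info/train_filelist_google.txt") as f:  # assumed file layout
        for line in f:
            path, label = line.strip().rsplit(" ", 1)
            if int(label) < NUM_CLASSES:
                samples.append((path, int(label)))

    print(f"kept {len(samples)} training images across {NUM_CLASSES} classes")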
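Finally, the scattered numbers in the "Experiment Setup" row can be collected into a single PyTorch training-setup sketch. The crop size, padding, and SGD momentum below are assumptions (the quoted text does not state them), and CutMix uses the implementation shipped with torchvision >= 0.16 rather than the paper's own code:

    import torch
    from torch import optim
    from torchvision import models, transforms
    from torchvision.transforms import v2

    DATASET = "ANIMAL-10N"  # or "WebVision"

    # Values quoted from the table above.
    LR = 0.1 if DATASET == "ANIMAL-10N" else 0.01
    WEIGHT_DECAY = 1e-3 if DATASET == "ANIMAL-10N" else 5e-4
    BATCH_SIZE, EPOCHS = 128, 200
    NUM_CLASSES = 10 if DATASET == "ANIMAL-10N" else 50

    # RandomCrop / RandomHorizontalFlip as stated; the 64x64 crop and
    # 4-pixel padding are assumptions, not given in the paper excerpt.
    train_transform = transforms.Compose([
        transforms.RandomCrop(64, padding=4),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ])

    # VGG19-BN backbone for ANIMAL-10N per the "Software Dependencies" row
    # (the paper uses an Inception-ResNet backbone for WebVision).
    model = models.vgg19_bn(weights=None, num_classes=NUM_CLASSES)

    # SGD is named in the paper; momentum=0.9 is an assumption.
    optimizer = optim.SGD(model.parameters(), lr=LR,
                          momentum=0.9, weight_decay=WEIGHT_DECAY)

    # CutMix operates on whole batches inside the training loop.
    cutmix = v2.CutMix(num_classes=NUM_CLASSES)
    images = torch.rand(BATCH_SIZE, 3, 64, 64)
    labels = torch.randint(0, NUM_CLASSES, (BATCH_SIZE,))
    mixed_images, mixed_labels = cutmix(images, labels)  # labels become soft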