SIGUA: Forgetting May Make Learning with Noisy Labels More Robust

Authors: Bo Han, Gang Niu, Xingrui Yu, Quanming Yao, Miao Xu, Ivor Tsang, Masashi Sugiyama

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that SIGUA successfully robustifies two typical base learning methods, so that their performance is often significantly improved.
Researcher Affiliation | Collaboration | 1 Hong Kong Baptist University; 2 RIKEN; 3 AAII, University of Technology Sydney; 4 4Paradigm Inc. (Hong Kong); 5 University of Queensland; 6 The University of Tokyo.
Pseudocode | Yes | Algorithm 1 SIGUA-prototype (in a mini-batch). A hedged sketch of this mini-batch update appears after the table.
Open Source Code | No | The paper does not provide any concrete access information, such as a repository link or an explicit statement of code release, for the methodology described.
Open Datasets | Yes | We verify the effectiveness of SIGUA-SL and SIGUA-BC on noisy MNIST, CIFAR-10, CIFAR-100 and NEWS following Han et al. (2018b). (A sketch of a label-noise injection step in this spirit appears after the table.)
Dataset Splits | No | The paper mentions 'validation data' for tuning hyperparameters and discusses 'Test Accuracy', but it does not specify explicit train/validation/test dataset splits (e.g., exact percentages or sample counts).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments.
Software Dependencies | No | The paper mentions using 'PyTorch' as a deep learning framework but does not specify its version number or any other software dependencies with version details.
Experiment Setup | Yes | In SET1, O is Adam (Kingma & Ba, 2015) in its default setting, and the number of epochs is 200 with batch size n_b as 128; the learning rate is linearly decayed to 0 from epoch 80 to 200. We set γ = 0.01 for all cases, except that γ = 0.001 for pair-45% of MNIST. ... SET2 is a bit complicated: for MNIST, O is Adam with betas as (0.9, 0.1), and lr is divided by 10 every 10 epochs; for CIFAR-10, O is SGD with momentum as 0.9, and lr is divided by 10 every 20 epochs; other hyperparameters have the same values as in SET1. We simply set γ = 1.0 for all cases. (A sketch of the SET1 schedule appears after the table.)
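
The Pseudocode row refers to Algorithm 1, SIGUA-prototype (in a mini-batch). Below is a minimal PyTorch sketch of what such an update could look like: gradient descent on examples judged good and underweighted gradient ascent (scaled by γ) on examples judged bad. The function name, the `keep_ratio` parameter, and the small-loss selection rule are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def sigua_step(model, optimizer, x, y, gamma=0.01, keep_ratio=0.7):
    """One SIGUA-style update: descent on presumed-good examples,
    underweighted ascent on presumed-bad ones (scaled by gamma)."""
    logits = model(x)
    per_example_loss = F.cross_entropy(logits, y, reduction="none")

    # Small-loss selection (an assumption for illustration): treat the
    # keep_ratio fraction with the smallest losses as "good", the rest as "bad".
    n_good = max(1, int(keep_ratio * len(y)))
    order = torch.argsort(per_example_loss)
    good_idx, bad_idx = order[:n_good], order[n_good:]

    loss_good = per_example_loss[good_idx].mean()
    loss_bad = (per_example_loss[bad_idx].mean()
                if len(bad_idx) > 0 else torch.zeros_like(loss_good))

    # Descend on good data; ascend (i.e., subtract) on bad data, underweighted by gamma.
    loss = loss_good - gamma * loss_bad

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```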
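
The Open Datasets row states that the noisy versions of MNIST, CIFAR-10, CIFAR-100 and NEWS are built following Han et al. (2018b). As a rough illustration of that kind of setup, here is a sketch of symmetric label-noise injection; the helper name and the uniform-flip rule are assumptions, and the paper's exact transition matrices (e.g., pair flipping) are not reproduced.

```python
import numpy as np

def symmetric_noisify(labels, noise_rate, num_classes, seed=0):
    """Flip each label to a uniformly random *other* class with probability noise_rate."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    flip_mask = rng.random(len(labels)) < noise_rate
    for i in np.flatnonzero(flip_mask):
        # Choose a replacement class different from the current label.
        choices = [c for c in range(num_classes) if c != labels[i]]
        labels[i] = rng.choice(choices)
    return labels

# Example: corrupt 50% of CIFAR-10-style labels.
# noisy_targets = symmetric_noisify(clean_targets, noise_rate=0.5, num_classes=10)
```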
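
The Experiment Setup row describes SET1 as Adam in its default setting, 200 epochs, batch size 128, with the learning rate linearly decayed to 0 from epoch 80 to 200. The sketch below builds that schedule with a PyTorch LambdaLR; the base learning rate of 1e-3 (Adam's default) and the helper name are assumptions.

```python
import torch

def make_set1_optimizer(model, base_lr=1e-3, num_epochs=200, decay_start=80):
    optimizer = torch.optim.Adam(model.parameters(), lr=base_lr)

    def lr_lambda(epoch):
        # Multiplicative factor on base_lr: 1.0 before decay_start,
        # then linearly down to 0.0 at num_epochs.
        if epoch < decay_start:
            return 1.0
        return max(0.0, (num_epochs - epoch) / (num_epochs - decay_start))

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    return optimizer, scheduler

# Usage: call scheduler.step() once per epoch after the training loop over mini-batches.
```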