Early Stopping Against Label Noise Without Validation Data

Authors: Suqin Yuan, Lei Feng, Tongliang Liu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments, we show both the effectiveness of the Label Wave method across various settings and its capability to enhance the performance of existing methods for learning with noisy labels.
Researcher Affiliation | Academia | 1 Sydney AI Centre, School of Computer Science, The University of Sydney; 2 School of Computer Science and Engineering, Nanyang Technological University
Pseudocode | Yes | Algorithm 1 Label Wave (see also the code sketch following the table):
  Let θ₀ be the initial parameters and v be the local minimum of PC. Let p be the Patience, representing the number of times a worsening PC is observed before halting.
  Initialize θ ← θ₀, t ← 0, i ← 0, v ← ∞
  1: while i < p do
  2:   Update θ by running the training for n steps, and t ← t + n.
  3:   PC_t ← compute prediction changes (PC) at step t.
  4:   PC̄_t ← moving average of PC over the most recent k steps.
  5:   if PC̄_t < v then
  6:     v ← PC̄_t; i ← 0, θ* ← θ, t* ← t   // Model stored at every new local minimum.
  7:   else
  8:     i ← i + 1   // Count Patience when PC̄_t is larger than the local minimum.
  9:   end if
  10: end while
  Best parameters are θ*, and the best number of training steps is t*.
Open Source Code | No | The paper does not contain any explicit statement about releasing code or a link to a code repository for the Label Wave method.
Open Datasets | Yes | These datasets comprise seven vision-oriented sets: CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), CIFAR-N (Wei et al., 2021), Clothing1M (Xiao et al., 2015), WebVision (Li et al., 2017), Food-101 (Bossard et al., 2014), and Tiny-ImageNet (Le & Yang, 2015), along with a text-oriented dataset: NEWS (Kiryo et al., 2017; Yu et al., 2019).
Dataset Splits | Yes | The CIFAR-10 dataset is accessible via the torchvision.datasets module; 20% of the training data is held out for validation during the training process.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments.
Software Dependencies | Yes | Framework: PyTorch, Version 1.11.0.
Experiment Setup | Yes | Batch Size: 128 [...] Learning Rate: Fixed at 0.01. [...] Optimizer: Employs optim.SGD with momentum = 0.9. [...] By adjusting the batch sizes to 64, 128, 256, learning rates to 0.01, 0.005, 0.001, random seeds to 1, 2, 3, 4, 5, and employing different optimizers such as SGD with momentum (Robbins & Monro, 1951; Polyak, 1964), RMSprop (Tieleman et al., 2012), and Adam (Kingma & Ba, 2014) [...] (a setup sketch consistent with these values follows the table).
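
The Pseudocode row quotes Algorithm 1 (Label Wave). Since the paper releases no code, the following is a minimal Python/PyTorch sketch of that early-stopping criterion under stated assumptions: the helper count_prediction_changes, the class LabelWaveMonitor, and the patience/window defaults are illustrative names and values, not the authors' implementation.

import copy
from collections import deque

import torch


def count_prediction_changes(model, loader, prev_preds, device="cpu"):
    # Count how many training examples changed their predicted label since the
    # previous evaluation. The loader must iterate in a fixed order
    # (shuffle=False) so predictions align across calls.
    model.eval()
    preds = []
    with torch.no_grad():
        for inputs, _ in loader:
            logits = model(inputs.to(device))
            preds.append(logits.argmax(dim=1).cpu())
    preds = torch.cat(preds)
    changes = None if prev_preds is None else int((preds != prev_preds).sum())
    return changes, preds


class LabelWaveMonitor:
    # Tracks the moving average of prediction changes (PC) and signals a stop
    # after `patience` consecutive checks without a new minimum, mirroring
    # Algorithm 1 as quoted above.

    def __init__(self, patience=10, window=3):  # patience/window values are illustrative
        self.patience = patience
        self.recent = deque(maxlen=window)  # last k PC values
        self.best = float("inf")            # v in Algorithm 1
        self.bad_checks = 0                 # i in Algorithm 1
        self.best_state = None              # θ* in Algorithm 1
        self.best_step = None               # t* in Algorithm 1

    def update(self, pc, model, step):
        # Returns True when training should stop.
        self.recent.append(pc)
        smoothed = sum(self.recent) / len(self.recent)
        if smoothed < self.best:
            self.best = smoothed
            self.bad_checks = 0
            self.best_state = copy.deepcopy(model.state_dict())
            self.best_step = step
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience

In a training loop, one would call count_prediction_changes on the (noisy) training set every n steps, pass the resulting PC value to update, halt when it returns True, and restore θ* with model.load_state_dict(monitor.best_state).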
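
The Dataset Splits and Experiment Setup rows report CIFAR-10 loaded through torchvision.datasets, a 20% validation holdout, batch size 128, and SGD with learning rate 0.01 and momentum 0.9. A minimal setup consistent with those quoted values is sketched below; the ToTensor transform and the ResNet-18 backbone are assumptions, since the report does not quote them.

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms, models

transform = transforms.ToTensor()  # assumed; the quoted setup does not specify transforms

full_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)

# 20% of the training data held out for validation, per the Dataset Splits row.
val_size = int(0.2 * len(full_train))
train_set, val_set = random_split(
    full_train,
    [len(full_train) - val_size, val_size],
    generator=torch.Generator().manual_seed(1),  # seed 1 is among the reported seeds
)

train_loader = DataLoader(train_set, batch_size=128, shuffle=True)   # reported batch size
eval_loader = DataLoader(train_set, batch_size=128, shuffle=False)   # fixed order for PC tracking
val_loader = DataLoader(val_set, batch_size=128, shuffle=False)

model = models.resnet18(num_classes=10)  # assumed backbone; not quoted in this report
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # reported optimizer settings

The non-shuffled eval_loader over the training set is what the Label Wave monitor above would use to compute prediction changes, so the stopping criterion itself never consults the held-out validation split, in line with the paper's goal of early stopping without validation data.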