SELC: Self-Ensemble Label Correction Improves Learning with Noisy Labels

Authors: Yangdi Lu, Wenbo He

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct the experiments with class-conditional label noise on CIFAR-10 and CIFAR-100 [Krizhevsky et al., 2009]. Given these two datasets are initially clean, we follow [Patrini et al., 2017] to inject noise via a label transition matrix Q, where Q_ij = Pr[ŷ = j | y = i] denotes the probability that the noisy label ŷ is flipped from the true label y. We evaluate SELC under two types of noise: symmetric and asymmetric. ... Table 1 shows the results on CIFAR with different types and levels of class-conditional label noise. (A Python sketch of this noise-injection scheme appears after the table.)
Researcher Affiliation | Academia | Yangdi Lu, Wenbo He. Department of Computing and Software, McMaster University, Canada. {luy100, hew11}@mcmaster.ca
Pseudocode | Yes | We put pseudocode of SELC in Algorithm 1. (A PyTorch sketch of this training loop appears after the table.)
Algorithm 1: SELC pseudocode
Input: DNN f(Θ), training data D̂ = {(x_i, ŷ_i)}_{i=1}^N, estimated turning point T, total epochs T_max, hyperparameter α
Output: Optimized DNN f(Θ*)
1: Let t = ŷ.
2: Select an initial epoch T_e < T (e.g. T_e = T − 10).
3: while epoch e < T_max do
4:   if epoch e < T_e then
5:     Train f(Θ) by CE loss in Eq. (1) using SGD.
6:   else
7:     Update t by Eq. (5).
8:     Train f(Θ) by SELC loss in Eq. (6) using SGD.
9:   end if
10: end while
Open Source Code | Yes | The code is available at https://github.com/MacLLL/SELC.
Open Datasets | Yes | We conduct the experiments with class-conditional label noise on CIFAR-10 and CIFAR-100 [Krizhevsky et al., 2009]. ... We use ANIMAL-10N [Song et al., 2019], Clothing1M [Xiao et al., 2015] and WebVision [Li et al., 2017] to evaluate the performance of SELC under real-world label noise settings.
Dataset Splits | No | Note that we do not perform early stopping since we don't assume the presence of clean validation data.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU model, CPU type) used to run the experiments; it only names the network architectures, such as ResNet, that were trained.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA x.x).
Experiment Setup | Yes | We use ResNet-34 [He et al., 2016] as the backbone for both datasets, and train the model using SGD with a momentum of 0.9, a weight decay of 0.001, and a batch size of 128. The network is trained for 200 epochs. We set the initial learning rate to 0.02 and reduce it by a factor of 10 after 40 and 80 epochs. We fix the hyperparameter α = 0.9. (A configuration sketch appears after the table.)
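
The Research Type row describes injecting class-conditional noise through a transition matrix Q with Q_ij = Pr[ŷ = j | y = i]. The sketch below shows one way to implement this for symmetric noise; the function names and the 40% noise rate are illustrative assumptions, not taken from the SELC code.

```python
# Hypothetical sketch of class-conditional label noise injection via a
# transition matrix Q, where Q[i, j] = Pr[noisy label = j | true label = i].
import numpy as np

def build_symmetric_Q(num_classes: int, noise_rate: float) -> np.ndarray:
    """Symmetric noise: flip to every other class with equal probability."""
    Q = np.full((num_classes, num_classes), noise_rate / (num_classes - 1))
    np.fill_diagonal(Q, 1.0 - noise_rate)
    return Q

def inject_noise(labels: np.ndarray, Q: np.ndarray, seed: int = 0) -> np.ndarray:
    """Sample each noisy label from the row of Q indexed by the true label."""
    rng = np.random.default_rng(seed)
    return np.array([rng.choice(len(Q), p=Q[y]) for y in labels])

# Example: 40% symmetric noise on CIFAR-10-style labels (illustrative values).
true_labels = np.random.randint(0, 10, size=50_000)
noisy_labels = inject_noise(true_labels, build_symmetric_Q(10, 0.4))
```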
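
The Pseudocode row quotes Algorithm 1. A minimal PyTorch sketch of such a loop is given below, assuming that the corrected targets t are an exponential moving average of the one-hot noisy labels and the model's softmax predictions (Eq. (5)) and that the SELC loss (Eq. (6)) is a soft-target cross entropy; identifiers like `selc_loss` and `soft_targets` are illustrative, and the data loader is assumed to yield sample indices alongside each batch.

```python
# Minimal sketch of an SELC-style training loop following Algorithm 1.
import torch
import torch.nn.functional as F

def selc_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Cross entropy against soft (corrected) targets."""
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def train_selc(model, loader, optimizer, num_classes, T_e, T_max, alpha=0.9, device="cuda"):
    # Soft targets t, initialised to the one-hot noisy labels (line 1 of Algorithm 1).
    soft_targets = F.one_hot(
        torch.as_tensor(loader.dataset.targets), num_classes
    ).float().to(device)

    for epoch in range(T_max):
        for x, y, idx in loader:            # assumes the loader also yields sample indices
            x, y, idx = x.to(device), y.to(device), idx.to(device)
            logits = model(x)
            if epoch < T_e:
                loss = F.cross_entropy(logits, y)   # warm-up with CE loss (Eq. (1))
            else:
                with torch.no_grad():               # update t (Eq. (5))
                    p = F.softmax(logits, dim=1)
                    soft_targets[idx] = alpha * soft_targets[idx] + (1 - alpha) * p
                loss = selc_loss(logits, soft_targets[idx])   # SELC loss (Eq. (6))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```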
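
The Experiment Setup row specifies the optimizer and learning-rate schedule. A configuration matching those numbers could look as follows; torchvision's `resnet34` stands in for the paper's backbone and is an assumption, as is the CIFAR-10 class count.

```python
# Sketch of the reported setup: SGD, momentum 0.9, weight decay 1e-3,
# lr 0.02 divided by 10 after epochs 40 and 80, 200 epochs in total.
import torch
from torchvision.models import resnet34

model = resnet34(num_classes=10)   # use num_classes=100 for CIFAR-100
optimizer = torch.optim.SGD(model.parameters(), lr=0.02,
                            momentum=0.9, weight_decay=1e-3)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[40, 80], gamma=0.1)

for epoch in range(200):           # batch size 128 is set in the DataLoader
    # ... run one training epoch here ...
    scheduler.step()
```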