SELC: Self-Ensemble Label Correction Improves Learning with Noisy Labels
Authors: Yangdi Lu, Wenbo He
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct the experiments with class-conditional label noise on CIFAR-10 and CIFAR-100 [Krizhevsky et al., 2009]. Given these two datasets are initially clean, we follow [Patrini et al., 2017] to inject noise by label transition matrix Q, where Q_ij = Pr[ŷ = j | y = i] denotes the probability that noisy label ŷ is flipped from true label y. We evaluate SELC in two types of noise: symmetric and asymmetric. ... Table 1 shows the results on CIFAR with different types and levels of class-conditional label noise. (An illustrative noise-injection sketch follows the table.) |
| Researcher Affiliation | Academia | Yangdi Lu, Wenbo He, Department of Computing and Software, McMaster University, Canada. {luy100, hew11}@mcmaster.ca |
| Pseudocode | Yes | We put pseudocode of SELC in Algorithm 1. Algorithm 1: SELC pseudocode. Input: DNN f(Θ), training data D̂ = {(x_i, ŷ_i)} for i = 1, …, N, estimated turning point T, total epochs Tmax, hyperparameter α. Output: optimized DNN f(Θ*). 1: Let t = ŷ. 2: Select an initial epoch Te < T (e.g. Te = T − 10). 3: while epoch e < Tmax do 4: if epoch e < Te then 5: Train f(Θ) by CE loss in Eq. (1) using SGD. 6: else 7: Update t by Eq. (5). 8: Train f(Θ) by SELC loss in Eq. (6) using SGD. 9: end if 10: end while. (An illustrative PyTorch sketch of this warm-up/label-correction loop follows the table.) |
| Open Source Code | Yes | The code is available at https://github.com/MacLLL/SELC. |
| Open Datasets | Yes | We conduct the experiments with class-conditional label noise on CIFAR-10 and CIFAR-100 [Krizhevsky et al., 2009]. ... We use ANIMAL-10N [Song et al., 2019], Clothing1M [Xiao et al., 2015] and Webvision [Li et al., 2017] to evaluate the performance of SELC under the real-world label noise settings. |
| Dataset Splits | No | Note that we do not perform early stopping since we don't assume the presence of clean validation data. |
| Hardware Specification | No | The paper does not provide specific hardware specifications (e.g., GPU model, CPU type) used for running the experiments. It only mentions using ResNet and other models. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA x.x). |
| Experiment Setup | Yes | We use the ResNet34 [He et al., 2016] as backbone for both datasets, and train the model using SGD with a momentum of 0.9, a weight decay of 0.001, and a batch size of 128. The network is trained for 200 epochs. We set the initial learning rate as 0.02, and reduce it by a factor of 10 after 40 and 80 epochs. We fix hyperparameter α = 0.9. (An illustrative optimizer and LR-schedule configuration follows the table.) |
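
The noise-injection procedure quoted under "Research Type" resamples each label from the row of a transition matrix Q indexed by its true class. The sketch below is not the authors' code; it assumes the common symmetric-noise convention (noise rate ε spread uniformly over the other classes, as in Patrini et al., 2017), and the label array is a placeholder.

```python
# Illustrative sketch: inject class-conditional label noise via a transition
# matrix Q, where Q[i, j] = Pr[noisy label = j | true label = i].
import numpy as np

def build_symmetric_Q(num_classes: int, noise_rate: float) -> np.ndarray:
    """Symmetric noise: with probability noise_rate, flip uniformly to another class."""
    Q = np.full((num_classes, num_classes), noise_rate / (num_classes - 1))
    np.fill_diagonal(Q, 1.0 - noise_rate)
    return Q

def inject_noise(labels: np.ndarray, Q: np.ndarray, seed: int = 0) -> np.ndarray:
    """Resample each label from the row of Q indexed by its true class."""
    rng = np.random.default_rng(seed)
    return np.array([rng.choice(len(Q), p=Q[y]) for y in labels])

# Example: 40% symmetric noise on placeholder CIFAR-10-style labels.
clean_labels = np.random.randint(0, 10, size=50_000)
noisy_labels = inject_noise(clean_labels, build_symmetric_Q(10, 0.4))
```

Asymmetric noise would use a hand-crafted Q that flips only between semantically similar class pairs, but the sampling step is identical.

A minimal PyTorch sketch of the Algorithm 1 loop quoted under "Pseudocode" follows, under these assumptions: `soft_targets` is a per-sample [N, C] buffer initialized to the one-hot noisy labels (line 1 of Algorithm 1), `idx` are the batch indices, and the self-ensemble update of Eq. (5) is an exponential moving average of the model's softmax outputs with momentum α, with Eq. (6) being cross-entropy against those corrected targets. Variable names and the training-loop plumbing are placeholders, not the authors' implementation.

```python
# Illustrative SELC training step: standard CE during warm-up (epoch < Te),
# then EMA-corrected soft targets and soft-target cross-entropy afterwards.
import torch
import torch.nn.functional as F

alpha = 0.9  # self-ensemble momentum, as reported in the paper

def selc_step(model, x, idx, noisy_labels, soft_targets, epoch, Te):
    logits = model(x)
    if epoch < Te:
        # Warm-up phase: cross-entropy against the (possibly noisy) hard labels.
        return F.cross_entropy(logits, noisy_labels)
    with torch.no_grad():
        probs = F.softmax(logits, dim=1)
        # Eq. (5) (assumed form): t <- alpha * t + (1 - alpha) * p, per sample.
        soft_targets[idx] = alpha * soft_targets[idx] + (1 - alpha) * probs
    # Eq. (6) (assumed form): cross-entropy with the corrected soft targets.
    return -(soft_targets[idx] * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```

Keeping the target update inside `torch.no_grad()` makes the corrected targets act as constants, so gradients flow only through the current predictions.

Finally, a hedged sketch of the reported optimization setup under "Experiment Setup" (SGD, momentum 0.9, weight decay 0.001, batch size 128, 200 epochs, initial learning rate 0.02 divided by 10 after epochs 40 and 80). torchvision's `resnet34` stands in for the backbone; the authors likely use a CIFAR-adapted ResNet34, and the training pass itself is omitted.

```python
# Illustrative optimizer and learning-rate schedule matching the reported setup.
import torch
from torchvision.models import resnet34

model = resnet34(num_classes=10)  # 10 classes for CIFAR-10, 100 for CIFAR-100
optimizer = torch.optim.SGD(model.parameters(), lr=0.02,
                            momentum=0.9, weight_decay=1e-3)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[40, 80], gamma=0.1)

for epoch in range(200):
    # ... one training pass over 128-sample mini-batches goes here ...
    scheduler.step()
```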
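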
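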