Regroup Median Loss for Combating Label Noise
Authors: Fengpeng Li, Kemou Li, Jinyu Tian, Jiantao Zhou
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared to state-of-the-art methods, for both the traditionally trained and semi-supervised models, RML achieves a significant improvement on synthetic and complex real-world datasets. The source code is available at https://github.com/Feng-peng-Li/Regroup-Loss-Median-to-Combat-Label-Noise. We perform experiments on synthetic datasets and real-world datasets. Tab. 1 shows the experimental comparisons on CIFAR-10 and CIFAR-100 without the semi-supervised strategy. RML increases the average test accuracy by about 1% on CIFAR-10 and by about 6% on CIFAR-100. |
| Researcher Affiliation | Academia | Fengpeng Li1, Kemou Li1, Jinyu Tian2, Jiantao Zhou1* 1State Key Laboratory of Internet of Things for Smart City, Department of Computer and Information Science, University of Macau 2Faculty of Innovation Engineering, Macau University of Science and Technology |
| Pseudocode | Yes | The pseudocode of the RML-based method is described in Alg. 1 of the Appendix. The detailed procedure can be found in Alg. 2 of the Appendix. |
| Open Source Code | Yes | The source code is available at https://github.com/Feng-peng-Li/Regroup-Loss-Median-to-Combat-Label-Noise. |
| Open Datasets | Yes | For experiments on synthetic datasets, we choose two commonly used datasets, CIFAR-10 and CIFAR-100, with different rates of symmetric label noise, pairflip label noise, and instance-dependent label noise (Xia et al. 2020). For the real-world datasets, we choose Clothing1M and WebVision. (A hedged sketch of symmetric noise injection follows the table.) |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly provide percentages or counts for a separate validation split, nor does it cite a standard validation split. |
| Hardware Specification | Yes | All our experiments are performed on Ubuntu 20.04.3 LTS workstations with Intel Xeon 5120 CPUs and five NVIDIA RTX 3090 GPUs, using PyTorch. |
| Software Dependencies | No | The paper mentions PyTorch but does not specify version numbers or list other software dependencies. |
| Experiment Setup | Yes | For the experiments on CIFAR-10, we set k to 60 for a symmetric label noise ratio of 0.8. For instance-dependent and symmetric label noise with a 0.2 ratio, k is 600. The remaining experiments on CIFAR-10 adopt k = 200. The experiments on CIFAR-100 use k = 50 when the noise rate is 0.2. For the experiments on CIFAR-100 with a noise rate of 0.8, k is 6. For the rest of the experiments, k is set to 20. In our experiments, λ is set to 0.999 according to (Tarvainen and Valpola 2017). To balance model performance and convergence speed, we choose n = 6 on all datasets. (A hedged configuration sketch follows the table.) |
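
The Open Datasets row above refers to symmetric, pairflip, and instance-dependent label noise on CIFAR-10/100. Below is a minimal sketch of how symmetric noise is commonly injected; the function name `symmetric_noise`, the seed, and the convention of always flipping to a *different* class are assumptions for illustration, not the authors' exact preprocessing.

```python
# Hedged sketch: inject symmetric label noise into CIFAR-10 training labels.
# Assumes the "flip to a uniformly random different class" convention; some
# papers instead sample uniformly over all classes including the original.
import numpy as np
from torchvision.datasets import CIFAR10

def symmetric_noise(labels, noise_rate, num_classes=10, seed=0):
    """Flip each label to a uniformly random different class with probability noise_rate."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    flip_mask = rng.random(len(labels)) < noise_rate
    random_labels = rng.integers(0, num_classes, size=len(labels))
    # Resample any flip that accidentally kept the original class.
    same = flip_mask & (random_labels == labels)
    while same.any():
        random_labels[same] = rng.integers(0, num_classes, size=same.sum())
        same = flip_mask & (random_labels == labels)
    labels[flip_mask] = random_labels[flip_mask]
    return labels

train_set = CIFAR10(root="./data", train=True, download=True)
train_set.targets = symmetric_noise(train_set.targets, noise_rate=0.2).tolist()
```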
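
The Experiment Setup row gives per-dataset values of k, λ = 0.999 (an exponential-moving-average decay following Tarvainen and Valpola 2017), and n = 6. A hedged configuration sketch under those stated settings is shown below; `select_k`, `ema_update`, and the fallback for unlisted noise settings are illustrative assumptions, not the released implementation.

```python
# Hedged sketch: look up k from the settings quoted above, and apply the
# teacher/student EMA update implied by lambda = 0.999.
import torch

def select_k(dataset, noise_type, noise_rate):
    """Return k per the reported settings; unlisted combinations fall back to
    the per-dataset default (200 for CIFAR-10, 20 for CIFAR-100)."""
    if dataset == "cifar10":
        if noise_type == "symmetric" and noise_rate == 0.8:
            return 60
        if noise_rate == 0.2 and noise_type in ("symmetric", "instance"):
            return 600
        return 200
    if dataset == "cifar100":
        if noise_rate == 0.2:
            return 50
        if noise_rate == 0.8:
            return 6
        return 20
    raise ValueError(f"unknown dataset: {dataset}")

@torch.no_grad()
def ema_update(teacher, student, lam=0.999):
    """Exponential moving average of student weights into the teacher model.
    (Buffers such as BatchNorm statistics are often copied directly instead.)"""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(lam).add_(s_param, alpha=1.0 - lam)
```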