Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations
Authors: Jiaheng Wei, Zhaowei Zhu, Hao Cheng, Tongliang Liu, Gang Niu, Yang Liu
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We quantitatively and qualitatively show that real-world noisy labels follow an instance-dependent pattern rather than the classically assumed and adopted ones (e.g., class-dependent label noise). We then initiate an effort to benchmark a subset of the existing solutions using CIFAR-10N and CIFAR-100N. We further proceed to study the memorization of correct and wrong predictions, which further illustrates the difference between human noise and class-dependent synthetic noise. |
| Researcher Affiliation | Academia | University of California, Santa Cruz; TML Lab, University of Sydney; RIKEN. {jiahengwei,zwzhu,haocheng,yangliu}@ucsc.edu, tongliang.liu@sydney.edu.au, gang.niu.ml@gmail.com |
| Pseudocode | No | The paper details experimental procedures and refers to existing algorithms (e.g., ResNet-34, Co-teaching+), but does not present any pseudocode or clearly labeled algorithm blocks within its text. |
| Open Source Code | Yes | A starter code is provided in https://github.com/UCSC-REAL/cifar-10-100n. |
| Open Datasets | Yes | This work presents two new benchmark datasets, named CIFAR-10N and CIFAR-100N (jointly, CIFAR-N), equipping the training sets of CIFAR-10 and CIFAR-100 with human-annotated real-world noisy labels collected from Amazon Mechanical Turk. The corresponding datasets and the leaderboard are available at http://noisylabels.com. |
| Dataset Splits | Yes | The CIFAR-10 (Krizhevsky et al., 2009) dataset contains 60k 32×32 color images: 50k images for training and 10k images for testing. The CIFAR-100 (Krizhevsky et al., 2009) dataset contains 60k 32×32 color images of 100 fine classes: 50k images for training and 10k images for testing. |
| Hardware Specification | Yes | All experiments run on a GPU cluster (500 GPUs of various kinds, mainly 2080 Ti) for training and evaluation. |
| Software Dependencies | No | The paper mentions using a ResNet-34 model and SGD optimizer, but does not specify software versions for libraries like TensorFlow, PyTorch, or Python, which are necessary for full reproducibility. |
| Experiment Setup | Yes | The basic hyper-parameters settings for CIFAR-10N and CIFAR-100N are listed as follows: minibatch size (128), optimizer (SGD), initial learning rate (0.1), momentum (0.9), weight decay (0.0005), number of epochs (100) and learning rate decay (0.1 at 50 epochs). Standard data augmentation is applied to each dataset. |
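The hyper-parameter row above fully specifies the reported optimization schedule. As a minimal, dependency-free sketch (not the authors' code), the settings can be captured as a config dict plus the step learning-rate decay they describe; the names `HYPERPARAMS` and `learning_rate` are illustrative, not from the paper:

```python
# Hypothetical sketch of the training setup reported for CIFAR-10N/CIFAR-100N:
# SGD, batch size 128, initial LR 0.1, momentum 0.9, weight decay 5e-4,
# 100 epochs, LR multiplied by 0.1 at epoch 50.

HYPERPARAMS = {
    "batch_size": 128,
    "optimizer": "SGD",
    "initial_lr": 0.1,
    "momentum": 0.9,
    "weight_decay": 5e-4,
    "epochs": 100,
    "lr_decay": {"factor": 0.1, "at_epoch": 50},
}

def learning_rate(epoch: int,
                  initial_lr: float = 0.1,
                  decay_factor: float = 0.1,
                  decay_epoch: int = 50) -> float:
    """Step decay: the LR is multiplied by `decay_factor` once,
    starting at `decay_epoch`, and held constant otherwise."""
    return initial_lr * (decay_factor if epoch >= decay_epoch else 1.0)
```

In a PyTorch reproduction, this would likely correspond to `torch.optim.SGD(params, lr=0.1, momentum=0.9, weight_decay=5e-4)` with a `MultiStepLR(optimizer, milestones=[50], gamma=0.1)` scheduler, though the paper does not state the framework.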