Robustness of Accuracy Metric and its Inspirations in Learning with Noisy Labels
Authors: Pengfei Chen, Junjie Ye, Guangyong Chen, Jingwei Zhao, Pheng-Ann Heng
AAAI 2021, pp. 11451-11461 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our theoretical results and additional claims with extensive experiments. We show characterizations of models trained with noisy labels, motivated by our theoretical results, and verify the utility of a noisy validation set by showing the impressive performance of a framework termed noisy best teacher and student (NTS). Our code is released. |
| Researcher Affiliation | Collaboration | Pengfei Chen (1), Junjie Ye (2), Guangyong Chen (3), Jingwei Zhao (2), Pheng-Ann Heng (1,3); 1: The Chinese University of Hong Kong; 2: VIVO AI Lab; 3: Guangdong Provincial Key Laboratory of Computer Vision and Virtual Reality Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences |
| Pseudocode | No | The paper describes the NTS framework steps in text but does not provide a formally labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Our code is released: https://github.com/chenpf1025/RobustnessAccuracy |
| Open Datasets | Yes | CIFAR-10 and CIFAR-100. We use Wide ResNet-28-10 (WRN-28-10) (Zagoruyko and Komodakis 2016) as the classifier on CIFAR-10 and CIFAR-100. Also Clothing1M, a large-scale benchmark containing real-world noise. |
| Dataset Splits | Yes | We corrupt the CIFAR training set, which has 50000 samples, and randomly split 5000 noisy samples for validation (a split sketch follows this table). For Clothing1M, 1 million noisy samples are used for training, with 14k and 10k clean samples respectively for validation and test; we additionally sample 14k noisy samples at random (i.e., 1k samples per class) from the 1 million noisy training samples for validation. |
| Hardware Specification | Yes | All models are trained with Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions software components like 'SGD optimizer', 'Cross Entropy (CE) loss', 'Generalized Cross Entropy (GCE)', 'Co-teaching (Co-T)', and 'Determinant based Mutual Information (DMI)' but does not provide specific version numbers for any of these or the underlying programming languages/frameworks. |
| Experiment Setup | Yes | CE, GCE and Co-T share the same batch size of 128 and learning rate schedule, i.e., training with the SGD optimizer for 200 epochs with an initial learning rate of 0.1, which is decreased by a factor of 5 after 60, 120 and 160 epochs. Following its original paper and official implementation, DMI uses a model pretrained by CE as initialization and requires a larger batch size of 256 and a smaller learning rate, which is tuned in {10^-4, 10^-5, 10^-6} and finally fixed to 10^-6. It is trained with the SGD optimizer for 100 epochs without learning rate changes. In all methods, the SGD optimizer uses momentum 0.9 and weight decay 5×10^-4 (a PyTorch-style configuration sketch follows this table). |
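
As a companion to the Dataset Splits row, the snippet below is a minimal sketch of producing a noisy CIFAR-10 training/validation split (corrupt the 50000 training labels, hold out 5000 noisy samples for validation). The symmetric-noise model, the 40% noise rate, the `corrupt_and_split` name, and the dummy labels are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def corrupt_and_split(labels, num_classes=10, noise_rate=0.4,
                      val_size=5000, seed=0):
    """Symmetrically corrupt labels, then hold out a noisy validation split.

    The noise rate and the symmetric-noise model are assumptions for
    illustration; the paper evaluates several noise types and rates.
    """
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    # Flip each label with probability `noise_rate` to a uniformly random class.
    flip = rng.random(len(labels)) < noise_rate
    noisy[flip] = rng.integers(0, num_classes, flip.sum())

    # Randomly hold out `val_size` noisy samples for validation.
    perm = rng.permutation(len(labels))
    val_idx, train_idx = perm[:val_size], perm[val_size:]
    return train_idx, val_idx, noisy

# Example with CIFAR-10-sized dummy labels (50000 training samples).
clean_labels = np.random.randint(0, 10, 50000)
train_idx, val_idx, noisy_labels = corrupt_and_split(clean_labels)
```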
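
The Experiment Setup row maps onto a standard PyTorch optimizer/scheduler configuration. The sketch below is an assumption-laden illustration, not the authors' training script: the linear model is a placeholder for WRN-28-10, the mini-batch loop body is elided, and the only point it makes is that "decreased by a factor of 5 after 60, 120 and 160 epochs" corresponds to `MultiStepLR` with `gamma=0.2`.

```python
import torch
import torch.nn as nn

# Placeholder model standing in for WRN-28-10 (assumption for brevity).
model = nn.Linear(3 * 32 * 32, 10)

# SGD with momentum 0.9, weight decay 5e-4, initial learning rate 0.1.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
# Learning rate divided by 5 (gamma = 0.2) after epochs 60, 120 and 160.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.2)

criterion = nn.CrossEntropyLoss()  # CE baseline loss
for epoch in range(200):
    # ... standard training loop over mini-batches of size 128,
    # calling optimizer.step() per batch (elided here) ...
    scheduler.step()
```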