Robust Training under Label Noise by Over-parameterization
Authors: Sheng Liu, Zhihui Zhu, Qing Qu, Chong You
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally demonstrate the effectiveness of our proposed SOP method on datasets with both synthetic (i.e., CIFAR-10 and CIFAR-100) and realistic (i.e., CIFAR-N, Clothing-1M, and WebVision) label noise. |
| Researcher Affiliation | Collaboration | Sheng Liu (1), Zhihui Zhu (2), Qing Qu (3), Chong You (4). (1) Center for Data Science, New York University; (2) Electrical and Computer Engineering, University of Denver; (3) Department of EECS, University of Michigan; (4) Google Research, New York City. |
| Pseudocode | Yes | Algorithm 1 Image classification under label noise by the method of Sparse Over-Parameterization (SOP). |
| Open Source Code | Yes | Code is available at https://github.com/shengliu66/SOP. |
| Open Datasets | Yes | Dataset descriptions. We use datasets with synthetic label noise generated from CIFAR-10 and CIFAR-100 datasets (Krizhevsky et al., 2009). ... For datasets with realistic label noise, we test on CIFAR-10N/CIFAR-100N (Wei et al., 2021b) ... We also test on Clothing1M (Xiao et al., 2015) ... Finally, we also test on the mini WebVision dataset (Li et al., 2017) |
| Dataset Splits | Yes | Each dataset [CIFAR-10/100] contains 50k training images and 10k test images... Clothing-1M contains 1 million training images, 15k validation images, and 10k test images with clean labels. |
| Hardware Specification | Yes | Finally, we compare the training time (on a single Nvidia V100 GPU) of our method to the baseline methods in Table 5. |
| Software Dependencies | Yes | We implement our method with PyTorch v1.7. |
| Experiment Setup | Yes | Network structures & hyperparameters. We implement our method with PyTorch v1.7. For each dataset, the choices of network architectures and hyperparameters for SOP are as follows. Additional details, as well as hyper-parameters for both SOP and SOP+, can be found in Appendix A.4. ... We follow (Liu et al., 2020) to use ResNet-34 and PreAct ResNet-18 architectures trained with SGD using a 0.9 momentum. The initial learning rate is 0.02 decayed with a factor of 10 at the 40th and 80th epochs... Weight decay for network parameters θ is set to 5×10⁻⁴. No weight decay is used for parameters {u_i, v_i} (i = 1, …, N). |
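
The Pseudocode row above quotes Algorithm 1 (SOP). As a rough illustration of the parameterization that algorithm trains, the sketch below models each sample's label error as s_i = u_i ⊙ u_i − v_i ⊙ v_i and fits softmax(f(x_i; θ)) + s_i to the observed one-hot label. This is a minimal sketch of the core idea only: `SparseOverparamNoise` and `sop_loss` are illustrative names, and the authors' released code (linked above) contains the full loss, including the SOP+ refinements.

```python
import torch
import torch.nn.functional as F

class SparseOverparamNoise(torch.nn.Module):
    """Per-sample noise variables s_i = u_i ⊙ u_i − v_i ⊙ v_i.

    The implicit bias of (stochastic) gradient descent on this
    over-parameterization drives s_i toward a sparse vector, which is
    what lets it absorb the label corruption.
    """
    def __init__(self, num_samples, num_classes, init_scale=1e-8):
        super().__init__()
        # Small random initialization, as is standard for this kind of
        # over-parameterized sparse recovery.
        self.u = torch.nn.Parameter(init_scale * torch.randn(num_samples, num_classes))
        self.v = torch.nn.Parameter(init_scale * torch.randn(num_samples, num_classes))

    def forward(self, indices):
        # Gather the noise estimates for the samples in the current batch.
        return self.u[indices] ** 2 - self.v[indices] ** 2

def sop_loss(logits, noisy_labels, s):
    # Fit softmax(f(x_i; θ)) + s_i to the observed (possibly corrupted)
    # one-hot label; a simplified squared-error form of the objective.
    probs = F.softmax(logits, dim=1)
    targets = F.one_hot(noisy_labels, num_classes=logits.size(1)).float()
    return F.mse_loss(probs + s, targets)
```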
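The Open Datasets row notes that the synthetic-noise benchmarks are generated from CIFAR-10/100. One common way to produce the symmetric variant of such noise is sketched below; `inject_symmetric_noise` is a hypothetical helper written for illustration, not taken from the authors' code, and the paper also evaluates asymmetric noise, which flips labels between semantically similar classes instead.

```python
import numpy as np

def inject_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    """Flip a `noise_rate` fraction of labels uniformly to other classes."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    flip = rng.random(len(labels)) < noise_rate  # which samples to corrupt
    for i in np.flatnonzero(flip):
        # Replace the true label with a uniformly random *different* class.
        choices = [c for c in range(num_classes) if c != labels[i]]
        labels[i] = rng.choice(choices)
    return labels
```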
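The Experiment Setup row pins down most of the optimization recipe. Under those stated values, the configuration might look as follows in PyTorch; `model` and `noise` are illustrative stand-ins (the noise module is the sketch above), and the paper tunes separate hyper-parameters for {u_i, v_i} in Appendix A.4 that this sketch omits by reusing the base learning rate.

```python
import torch
import torchvision

# Illustrative stand-ins: a ResNet-34 backbone (one of the quoted
# architectures) and the per-sample u/v module sketched above.
model = torchvision.models.resnet34(num_classes=10)
noise = SparseOverparamNoise(num_samples=50_000, num_classes=10)

optimizer = torch.optim.SGD(
    [
        {"params": model.parameters(), "weight_decay": 5e-4},  # decay on θ only
        {"params": noise.parameters(), "weight_decay": 0.0},   # none on {u_i, v_i}
    ],
    lr=0.02,
    momentum=0.9,
)
# Decay the learning rate by a factor of 10 at the 40th and 80th epochs.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[40, 80], gamma=0.1
)
```

Calling `scheduler.step()` once per epoch reproduces the quoted schedule: 0.02 until epoch 40, 0.002 until epoch 80, and 0.0002 thereafter.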