Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels
Authors: Lu Jiang, Di Huang, Mason Liu, Weilong Yang
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Performing controlled experiments on noisy data is essential to understanding deep learning across noise levels. Due to the lack of suitable datasets, previous research has only examined deep learning on controlled synthetic label noise, and real-world label noise has never been studied in a controlled setting. |
| Researcher Affiliation | Collaboration | 1Google Research, Mountain View, United States 2Google Cloud AI, Sunnyvale, United States 3Cornell University, Ithaca, United States. |
| Pseudocode | Yes | Algorithm 1 shows the four key steps to compute the loss for a mini-batch: weight (Steps 2-4), sample (Steps 5 and 8), mixup (Steps 9-12), and weight again (Step 14), where the weighting is achieved by the MentorNet. |
| Open Source Code | Yes | The data and code are released at the following link http://www.lujiang.info/cnlw.html. |
| Open Datasets | Yes | Our benchmark is built on top of two public datasets: Mini-ImageNet (Vinyals et al., 2016) for coarse-grained image classification and Stanford Cars (Krause et al., 2013) for fine-grained image classification. |
| Dataset Splits | Yes | Mini-ImageNet has images of size 84x84 with 100 classes from ImageNet (Deng et al., 2009). We use all 60K images for training and the 5K images in the ILSVRC12 validation set for testing. Stanford Cars contains 16,185 high-resolution images of 196 classes of cars (Make, Model, Year) split 50-50 into training and validation sets. |
| Hardware Specification | Yes | It is worth noting that these findings along with benchmark results are a result of conducting thousands of experiments using tremendous computation power (hundreds of thousands of V100 GPU hours). |
| Software Dependencies | No | We implement MentorMix in Algorithm 1 in TensorFlow. The paper does not specify the version number of TensorFlow or any other software dependencies, which is necessary for a reproducible software description. |
| Experiment Setup | Yes | We extensively search the hyperparameters for each method on every noise level. Vanilla is the standard training using l2 weight decay, dropout, and data augmentation. We search the hyperparameter for the weight decay in {e−5, e−4, e−3, e−2} and the dropout ratio in {0.9, 0.8, 0.7, 0.6, 0.5} as suggested in (Arpit et al., 2017). ... We search two hyperparameters α and γp in the ranges α = {0.4, 1, 2} and γp = {90%, 80%, 70%}. |
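The four steps quoted from Algorithm 1 (weight, sample, mixup, weight again) can be sketched on one mini-batch. This is not the authors' implementation: `mentornet_weights` is a simple loss-percentile threshold standing in for the learned MentorNet, and the final re-weighting step is only indicated in a comment because it requires a forward pass on the mixed examples.

```python
import numpy as np

def mentornet_weights(losses, gamma_p):
    # Steps 2-4 (simplified stand-in for MentorNet): keep examples
    # whose loss falls below the gamma_p loss percentile.
    thresh = np.percentile(losses, gamma_p * 100)
    return (losses <= thresh).astype(np.float64)

def mentormix_minibatch(x, y, losses, alpha=0.4, gamma_p=0.8, rng=None):
    # Sketch of Algorithm 1 for one mini-batch: weight, sample,
    # mixup, and (in comment form) weight again.
    rng = np.random.default_rng(0) if rng is None else rng
    n = losses.shape[0]
    v = mentornet_weights(losses, gamma_p)               # Steps 2-4
    p = v / v.sum() if v.sum() > 0 else np.full(n, 1.0 / n)
    idx = rng.choice(n, size=n, p=p)                     # Steps 5 and 8
    lam = rng.beta(alpha, alpha, size=n)                 # Steps 9-12
    lam = v * lam + (1.0 - v) * (1.0 - lam)              # favor the trusted side
    x_mix = lam[:, None] * x + (1.0 - lam[:, None]) * x[idx]
    y_mix = lam[:, None] * y + (1.0 - lam[:, None]) * y[idx]
    # Step 14: recompute losses on (x_mix, y_mix) and apply
    # mentornet_weights once more before averaging the batch loss.
    return x_mix, y_mix

# Toy batch: 8 examples, 4 features, 3 one-hot classes.
x = np.arange(32, dtype=np.float64).reshape(8, 4)
y = np.eye(3)[np.arange(8) % 3]
losses = np.linspace(0.1, 2.0, 8)
x_mix, y_mix = mentormix_minibatch(x, y, losses)
```

Since each mixed label is a convex combination of two one-hot vectors, every row of `y_mix` still sums to one, which is what makes the mixed batch usable with a standard cross-entropy loss.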
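The grids quoted in the Experiment Setup row can be enumerated with a plain Cartesian product. This is an illustrative sketch, not the authors' search code; interpreting the weight-decay values "e−5 ... e−2" as powers of e is an assumption about the garbled PDF notation.

```python
from itertools import product
import math

# Assumed reading of the quoted grids: weight decay e^-5..e^-2,
# dropout ratios, and MentorMix's alpha / gamma_p values.
weight_decays = [math.exp(-k) for k in (5, 4, 3, 2)]
dropouts = [0.9, 0.8, 0.7, 0.6, 0.5]
alphas = [0.4, 1, 2]
gamma_ps = [0.90, 0.80, 0.70]

# One config list per method, searched at every noise level.
vanilla_grid = list(product(weight_decays, dropouts))   # 4 x 5 = 20 configs
mentormix_grid = list(product(alphas, gamma_ps))        # 3 x 3 = 9 configs
```

Even these modest grids, repeated over every method and noise level, are consistent with the paper's note that the benchmark consumed hundreds of thousands of V100 GPU hours.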