Unsupervised Label Noise Modeling and Loss Correction

Authors: Eric Arazo, Diego Ortego, Paul Albert, Noel O’Connor, Kevin McGuinness

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on CIFAR-10/100 and TinyImageNet demonstrate a robustness to label noise that substantially outperforms recent state-of-the-art. |
| Researcher Affiliation | Academia | Insight Centre for Data Analytics, Dublin City University (DCU), Dublin, Ireland. |
| Pseudocode | No | The paper describes mathematical formulations and processes (e.g., the Expectation Maximization procedure), but it does not include any formal pseudocode or algorithm blocks (a hedged sketch of such an EM fit is given below the table). |
| Open Source Code | Yes | Source code is available at https://git.io/fjsvE |
| Open Datasets | Yes | We thoroughly validate our approach in two well-known image classification datasets: CIFAR-10 and CIFAR-100. ... We further experiment on TinyImageNet (subset of ImageNet (Deng et al., 2009)) and Clothing1M (Xiao et al., 2015) datasets to test the generality of our approach far from CIFAR data. |
| Dataset Splits | Yes | Both have 50K color images for training and 10K for validation with resolution 32 × 32. ... TinyImageNet contains 200 classes with 100K training images, 10K validation, 10K test with resolution 64 × 64, while Clothing1M contains 14 classes with 1M real-world noisy training samples and clean subsets for training (47K), validation (14K), and test (10K). |
| Hardware Specification | No | The paper mentions using "a PreAct ResNet-18" and "train it using SGD and batch size of 128" but does not specify the hardware used (e.g., GPU model, CPU, memory). |
| Software Dependencies | No | The paper mentions using "Convolutional Neural Networks (CNNs)" and "PreAct ResNet-18" but does not specify any software libraries or frameworks with version numbers (e.g., TensorFlow, PyTorch, scikit-learn, CUDA versions). |
| Experiment Setup | Yes | We use a PreAct ResNet-18 (He et al., 2016) and train it using SGD and batch size of 128. We use two different schemes for the learning rate policy and number of epochs depending on whether mixup is used (see Appendix B for further details). ... We introduce bootstrapping in epoch 105 for (Reed et al., 2015) and for the proposed methods, estimate the T matrix of (Patrini et al., 2017) in epoch 75 (as done in (Hendrycks et al., 2018)), and use the configuration reported in (Zhang et al., 2018) for mixup. |
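
The Pseudocode row notes that the paper's unsupervised noise model, a two-component beta mixture fit to per-sample training losses with Expectation Maximization, is given only as equations. The sketch below illustrates one way such a fit could look in NumPy/SciPy; the weighted method-of-moments M-step, the clipping constants, and the function name `fit_beta_mixture` are illustrative assumptions, not a transcription of the authors' released code.

```python
import numpy as np
from scipy import stats

def fit_beta_mixture(losses, n_iters=10, eps=1e-4):
    """Fit a two-component beta mixture to per-sample losses with EM.

    Clean samples tend to produce low losses and noisy samples high losses,
    so the posterior of the high-loss component can be read as a per-sample
    probability of having a noisy label.
    """
    # Min-max normalise the losses to (0, 1) and keep them away from the
    # boundaries, where the beta density can diverge.
    x = (losses - losses.min()) / (losses.max() - losses.min() + 1e-12)
    x = np.clip(x, eps, 1.0 - eps)

    # Initialise component 0 around low losses (clean) and component 1
    # around high losses (noisy).
    weights = np.array([0.5, 0.5])
    alphas = np.array([2.0, 4.0])
    betas = np.array([4.0, 2.0])

    for _ in range(n_iters):
        # E-step: responsibility of each component for each sample.
        pdf = np.stack([stats.beta.pdf(x, alphas[k], betas[k]) for k in range(2)])
        resp = weights[:, None] * pdf
        resp /= resp.sum(axis=0, keepdims=True) + 1e-12

        # M-step: weighted method-of-moments update of each beta component.
        for k in range(2):
            r = resp[k]
            mean = np.sum(r * x) / np.sum(r)
            var = np.sum(r * (x - mean) ** 2) / np.sum(r)
            common = mean * (1.0 - mean) / (var + 1e-12) - 1.0
            alphas[k] = max(mean * common, eps)
            betas[k] = max((1.0 - mean) * common, eps)
            weights[k] = r.mean()

    # Posterior probability that each sample belongs to the noisy component.
    pdf = np.stack([stats.beta.pdf(x, alphas[k], betas[k]) for k in range(2)])
    post = weights[:, None] * pdf
    return post[1] / (post.sum(axis=0) + 1e-12)
```

A returned `p_noisy = fit_beta_mixture(per_sample_losses)` would then feed the paper's loss-correction step, e.g. as per-sample weights for the bootstrapping targets.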
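
For the Dataset Splits row, the quoted CIFAR figures match the standard torchvision splits; a minimal loading sketch (the transform and data path are placeholder choices, not taken from the paper) is:

```python
import torchvision
import torchvision.transforms as T

# Standard CIFAR-10 splits quoted above: 50K 32x32 training images and a
# 10K held-out set used for evaluation. CIFAR-100 is loaded analogously.
transform = T.ToTensor()

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)   # 50,000 images
val_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=transform)  # 10,000 images

assert len(train_set) == 50_000 and len(val_set) == 10_000
```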
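
The Experiment Setup row quotes SGD, a batch size of 128, a PreAct ResNet-18, and the mixup configuration of Zhang et al. (2018). The sketch below shows one way that training loop could be wired up in PyTorch, reusing `train_set` from the loading sketch above; the learning rate, momentum, weight decay, the mixup alpha, and the use of torchvision's ResNet-18 as a stand-in for the PreAct variant are assumptions (the paper defers the exact schedules to its Appendix B).

```python
import numpy as np
import torch
import torch.nn.functional as F
import torchvision
from torch.utils.data import DataLoader

def mixup_batch(x, y_onehot, alpha=1.0):
    """mixup (Zhang et al., 2018): convex combinations of sample pairs and their labels."""
    lam = np.random.beta(alpha, alpha)
    idx = torch.randperm(x.size(0), device=x.device)
    return lam * x + (1 - lam) * x[idx], lam * y_onehot + (1 - lam) * y_onehot[idx]

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in backbone; the paper uses a PreAct ResNet-18 (He et al., 2016).
model = torchvision.models.resnet18(num_classes=10).to(device)

# SGD and batch size 128 as quoted; lr / momentum / weight decay are assumed values.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)

for x, y in loader:  # one epoch; the paper's learning-rate schedule is omitted here
    x, y = x.to(device), y.to(device)
    y_onehot = F.one_hot(y, num_classes=10).float()
    x_mix, y_mix = mixup_batch(x, y_onehot, alpha=1.0)
    # Cross-entropy against the mixed soft labels.
    loss = -(y_mix * F.log_softmax(model(x_mix), dim=1)).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```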