Curriculum Loss: Robust Learning and Generalization against Label Corruption

Authors: Yueming Lyu, Ivor W. Tsang

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on benchmark datasets validate the robustness of the proposed loss.
Researcher Affiliation | Academia | Yueming Lyu & Ivor W. Tsang, Centre for Artificial Intelligence, University of Technology Sydney; yueminglyu@gmail.com, Ivor.Tsang@uts.edu.au
Pseudocode | Yes | Algorithm 1: Partial Optimization; Algorithm 2: Training with Batch Noise Pruned Curriculum Loss (a selection sketch is given after the table).
Open Source Code | No | The paper states, "We implement NPCL by Pytorch," but it does not provide a link or an explicit release statement indicating that the source code for the proposed method (NPCL) is publicly available.
Open Datasets | Yes | We evaluate our NPCL by comparing Generalized Cross-Entropy (GCE) loss (Zhang & Sabuncu, 2018), Co-teaching (Han et al., 2018b), Co-teaching+ (Yu et al., 2019), MentorNet (Jiang et al., 2018) and standard network training on MNIST, CIFAR10 and CIFAR100 dataset as in (Han et al., 2018b; Patrini et al., 2017; Goldberger & Ben-Reuven, 2017).
Dataset Splits | No | The paper mentions training on MNIST, CIFAR10, and CIFAR100 and reporting test accuracy, but it does not explicitly describe the training/validation/test splits (e.g., percentages, sample counts, or references to predefined validation splits) beyond naming the datasets.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper states, "We implement NPCL by Pytorch," but does not give a version number for PyTorch or list any other software dependencies, which would be needed for a reproducible description of the environment.
Experiment Setup | Yes | Specifically, the batch size and the number of epochs is set to m = 128 and N = 200, respectively. The Adam optimizer with the same parameter as (Han et al., 2018b) is employed. ... For NPCL, we employ hinge loss as the base upper bound function of 0-1 loss. In the first few epochs, we train model using full batch with soft hinge loss (in the supplement) as a burn-in period suggested in (Jiang et al., 2018). Specifically, we start NPCL at 5th epoch on MNIST and 10th epoch on CIFAR10 and CIFAR100, respectively. Appendix L provides detailed network architectures. (A training-loop sketch of this setup follows the table.)
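
To make the pruning step named in Algorithm 2 (Training with Batch Noise Pruned Curriculum Loss) concrete, here is a minimal PyTorch sketch of a per-batch selection rule. It is an illustration, not the authors' code: the function name `prune_noisy_batch` is hypothetical, and the rule of keeping the (1 - noise rate) fraction of smallest-loss samples is an assumption about how the pruning is carried out.

```python
import torch

def prune_noisy_batch(per_sample_loss: torch.Tensor, noise_rate: float) -> torch.Tensor:
    """Return indices of the samples kept in a mini-batch (hypothetical helper).

    Assumed rule: keep the (1 - noise_rate) fraction of samples with the
    smallest per-sample loss, treating large-loss samples as likely corrupted.
    """
    batch_size = per_sample_loss.numel()
    num_keep = max(1, int(round((1.0 - noise_rate) * batch_size)))
    _, keep_idx = torch.topk(per_sample_loss, num_keep, largest=False)
    return keep_idx

# Usage: compute per-sample base losses, prune, then average only the kept ones.
losses = torch.tensor([0.1, 2.3, 0.4, 1.9, 0.2])   # toy per-sample losses
kept = prune_noisy_batch(losses, noise_rate=0.4)
batch_loss = losses[kept].mean()
```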
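
The quoted setup (batch size m = 128, N = 200 epochs, Adam, hinge loss as the base upper bound of the 0-1 loss, and a burn-in period with soft hinge loss before NPCL starts at the 5th epoch on MNIST and the 10th on CIFAR10/CIFAR100) maps onto a training loop along the following lines. This is a hedged sketch, not the authors' implementation: `soft_hinge_loss` is a stand-in for the soft hinge loss defined in their supplement, `nn.MultiMarginLoss` is used as a generic per-sample multi-class hinge, the pruning rule is the assumed small-loss selection from the sketch above, and the model, data loader (built with batch size 128), and the Adam learning-rate schedule of Han et al. (2018b) are left to the reader.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def soft_hinge_loss(logits, targets):
    """Stand-in for the paper's soft hinge loss (their exact form is in the supplement):
    a softplus-smoothed multi-class margin loss."""
    target_score = logits.gather(1, targets.unsqueeze(1)).squeeze(1)
    masked = logits.clone()
    masked.scatter_(1, targets.unsqueeze(1), float("-inf"))
    best_other = masked.max(dim=1).values
    return F.softplus(1.0 - (target_score - best_other))  # smooth max(0, 1 - margin)

def train_npcl(model, loader, device, dataset="cifar10",
               epochs=200, lr=1e-3, noise_rate=0.2):
    """Sketch of the reported schedule: Adam, 200 epochs, batch size 128 (set when
    building `loader`), burn-in with soft hinge loss, then NPCL-style training with
    a per-sample hinge base loss and small-loss pruning."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    hinge = nn.MultiMarginLoss(reduction="none")     # per-sample hinge-style base loss
    burn_in = 5 if dataset == "mnist" else 10        # NPCL starts at the 5th / 10th epoch
    for epoch in range(1, epochs + 1):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            logits = model(x)
            if epoch < burn_in:
                loss = soft_hinge_loss(logits, y).mean()   # full batch during burn-in
            else:
                per_sample = hinge(logits, y)
                # Assumed pruning: keep the (1 - noise_rate) smallest-loss samples.
                num_keep = max(1, int(round((1 - noise_rate) * per_sample.numel())))
                kept, _ = torch.topk(per_sample, num_keep, largest=False)
                loss = kept.mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```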