On Symmetric Losses for Learning from Corrupted Labels

Authors: Nontawat Charoenphakdee, Jongyeong Lee, Masashi Sugiyama

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | First, we emphasize that using a symmetric loss is advantageous in the balanced error rate (BER) minimization and area under the receiver operating characteristic curve (AUC) maximization from corrupted labels. Second, we prove general theoretical properties of symmetric losses, including a classification-calibration condition, excess risk bound, conditional risk minimizer, and AUC-consistency condition. Third, since all nonnegative symmetric losses are non-convex, we propose a convex barrier hinge loss that benefits significantly from the symmetric condition, although it is not symmetric everywhere. Finally, we conduct experiments to validate the relevance of the symmetric condition. (See the symmetric-loss sketch after the table.)
Researcher Affiliation | Academia | (1) Department of Computer Science, The University of Tokyo, Tokyo, Japan; (2) RIKEN Center for Advanced Intelligence Project, Tokyo, Japan.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper notes that the experiment code was implemented with specific frameworks (Chainer, PyTorch) but provides no concrete access to the authors' implementation, e.g., a repository link or an explicit statement of code release.
Open Datasets | Yes | We used datasets from the UCI machine learning repository (Lichman et al., 2013) and LIBSVM (Chang & Lin, 2011). We used MNIST (LeCun, 1998) (Odd vs. Even) and CIFAR-10 (Airplane vs. Horse) (Krizhevsky & Hinton, 2009) as the datasets.
Dataset Splits | Yes | Training data consists of 500 corrupted positive examples and 500 corrupted negative examples, with a class-balanced clean test set of 500 examples. (See the data-corruption sketch after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments; it only mentions using multilayer perceptrons and convolutional neural networks.
Software Dependencies | No | The experiment code was implemented with Chainer (Tokui et al., 2015) and PyTorch (Paszke et al., 2017), and the objective functions of the neural networks were optimized using AMSGrad (Reddi et al., 2018). While software names are mentioned, specific version numbers for these components are not provided.
Experiment Setup | Yes | We used a one-hidden-layer multilayer perceptron (d-500-1) as the model. The objective functions of the neural networks were optimized using AMSGrad (Reddi et al., 2018). For fairness, we fix b = 200 and r = 50 for all datasets in the experiment section. (See the training sketch after the table.)
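
For context on the symmetric condition referenced in the Research Type row: a margin loss ell is symmetric when ell(z) + ell(-z) is a constant for all z. The sigmoid loss is a standard example; the hinge-loss comparison below is our own illustration, not code from the paper.

```python
import numpy as np

def sigmoid_loss(z):
    # Sigmoid loss: ell(z) = 1 / (1 + exp(z)).
    # Symmetric: ell(z) + ell(-z) = 1 for every margin z.
    return 1.0 / (1.0 + np.exp(z))

def hinge_loss(z):
    # Hinge loss: ell(z) = max(0, 1 - z). Not symmetric.
    return np.maximum(0.0, 1.0 - z)

z = np.linspace(-5.0, 5.0, 11)
print(sigmoid_loss(z) + sigmoid_loss(-z))  # constant 1.0 at every z
print(hinge_loss(z) + hinge_loss(-z))      # varies with z
```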
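The Dataset Splits row describes training on corrupted labels with a clean test set. Below is a minimal sketch of one way to produce such data via class-conditional label flipping; the flip rates, the toy data, and the helper name corrupt_labels are our own assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt_labels(y, flip_pos=0.3, flip_neg=0.3):
    """Flip each +1 label with probability flip_pos and each -1 label
    with probability flip_neg (rates here are assumed, not the paper's)."""
    y = y.copy()
    flip = np.where(y == 1,
                    rng.random(y.shape) < flip_pos,
                    rng.random(y.shape) < flip_neg)
    y[flip] *= -1
    return y

# Toy stand-in for a real dataset (e.g., a UCI task).
X = rng.normal(size=(4000, 10))
y = np.where(X[:, 0] > 0, 1, -1)

y_noisy = corrupt_labels(y)

# 500 corrupted positives + 500 corrupted negatives for training,
# mirroring the split described in the table; test data stays clean.
pos_idx = np.flatnonzero(y_noisy == 1)[:500]
neg_idx = np.flatnonzero(y_noisy == -1)[:500]
train_idx = np.concatenate([pos_idx, neg_idx])
X_train, y_train = X[train_idx], y_noisy[train_idx]
```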
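The Experiment Setup row fixes b = 200 and r = 50 for the barrier hinge loss. The following is a training sketch under our reading of the loss's stated design: linear with slope -1 on [-r, r], where f(z) + f(-z) = 2r (the symmetric condition) holds, with steep slope-b walls outside the interval. The ReLU activation, input dimension, and batch are assumptions, AMSGrad is reached through PyTorch Adam's amsgrad flag, and the paper should be consulted for the exact published formula.

```python
import torch
import torch.nn as nn

def barrier_hinge(z, b=200.0, r=50.0):
    # Our reading of the barrier hinge loss: f(z) = r - z on [-r, r]
    # (so f(z) + f(-z) = 2r there, i.e., the symmetric condition),
    # with slope -b to the left of the interval and slope +b to the
    # right acting as barriers; see the paper for the exact definition.
    return torch.maximum(-b * (z + r) + r,
                         torch.maximum(b * (z - r), r - z))

d = 10  # input dimension; dataset-dependent (assumed here)
model = nn.Sequential(nn.Linear(d, 500), nn.ReLU(), nn.Linear(500, 1))  # d-500-1

# AMSGrad (Reddi et al., 2018) is available in PyTorch via Adam's flag.
optimizer = torch.optim.Adam(model.parameters(), amsgrad=True)

X = torch.randn(64, d)                # placeholder batch
y = torch.sign(torch.randn(64))       # corrupted labels in {-1, +1}

for _ in range(10):                   # toy training loop
    margin = y * model(X).squeeze(1)  # margin z = y * g(x)
    loss = barrier_hinge(margin).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

With b = 200 and r = 50 as in the table, the walls are steep enough that, as we read the paper's motivation, the network is pushed to produce margins inside [-r, r], where the loss behaves like a symmetric loss.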