Loss factorization, weakly supervised learning and label noise robustness

Authors: Giorgio Patrini, Frank Nielsen, Richard Nock, Marcello Carioni

ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The theory is validated by experiments in which we call the adapted SGD as a black box. ... We analyze experimentally the theory so far developed. ... The next results are based on UCI data. We learn with logistic loss, without model's intercept, and set λ = 10^-6 and T = 4 · 2m (4 epochs). We measure d_clean and R_{D,01}, injecting symmetric label noise p ∈ [0, 0.45) and averaging over 25 runs. ... We conclude with a systematic study of hold-out error of µSGD. The same datasets are now split in 1/5 test and 4/5 training sets once at random. In contrast with the previous experimental setting we perform cross-validation of λ ∈ 10^{-3,...,+3} on 5 folds in the training set. We compare with vanilla SGD run on corrupted sample S and measure the gain from estimating µ̂_S. ... Table 2 reports test error for SGD and µSGD over 25 trials of artificially corrupted datasets." (A sketch of the noise-injection and mean-operator estimation steps follows the table.)
Researcher Affiliation | Collaboration | Giorgio Patrini (Australian National University, Data61; GIORGIO.PATRINI@ANU.EDU.AU), Frank Nielsen (École Polytechnique, Sony Computer Science Laboratories Inc; NIELSEN@LIX.POLYTECHNIQUE.FR), Richard Nock (Data61, Australian National University; RICHARD.NOCK@DATA61.CSIRO.AU), Marcello Carioni (Max Planck Institute for Mathematics in the Sciences; MARCELLO.CARIONI@MIS.MPG.DE)
Pseudocode | Yes | "Algorithm 1: µSGD ... Algorithm 2: µSGD applied on noisy labels" (A minimal sketch of the loss factorization these algorithms build on follows the table.)
Open Source Code | No | No statement or link is provided regarding the availability of open-source code for the described methodology.
Open Datasets | No | "The next results are based on UCI data."
Dataset Splits | Yes | "The same datasets are now split in 1/5 test and 4/5 training sets once at random. In contrast with the previous experimental setting we perform cross-validation of λ ∈ 10^{-3,...,+3} on 5 folds in the training set."
Hardware Specification | No | No specific hardware details (like GPU/CPU models, processor types, or memory amounts) are mentioned for the experimental setup.
Software Dependencies | No | "We learn with logistic loss... The learning rate η is untouched from Shalev-Shwartz et al. (2011)... We learn with λ = 10^-6 by standard square loss."
Experiment Setup | Yes | "We learn with logistic loss, without model's intercept, and set λ = 10^-6 and T = 4 · 2m (4 epochs). ... we perform cross-validation of λ ∈ 10^{-3,...,+3} on 5 folds in the training set."
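
The table above cites Algorithms 1-2 without reproducing them. As a reading aid, here is a minimal NumPy sketch of the loss factorization that µSGD builds on: for a linear-odd loss such as logistic, ℓ(z) - ℓ(-z) = -z, so the empirical risk decomposes into a label-free even part plus a linear term in the mean operator µ_S = (1/m) Σ_i y_i x_i. The function names (`factored_logistic_risk`, `mu_gd`) and the full-batch optimizer are stand-ins of my own, not a transcription of the paper's Algorithm 1 (which is a stochastic, Pegasos-style solver with the learning rate of Shalev-Shwartz et al. (2011)); only the factorization identity and the value λ = 10^-6 come from the source.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def factored_logistic_risk(theta, X, mu_hat, lam):
    # Logistic loss is linear-odd: ell(z) - ell(-z) = -z. Hence
    #   (1/m) sum_i ell(y_i <theta, x_i>)
    #     = (1/2m) sum_i [ell(z_i) + ell(-z_i)] - (1/2) <theta, mu_S>,
    # where z_i = <theta, x_i>. Labels enter only through mu_hat,
    # an estimate of the mean operator mu_S = (1/m) sum_i y_i x_i.
    z = X @ theta
    even = 0.5 * (np.logaddexp(0.0, -z) + np.logaddexp(0.0, z)).mean()
    return even - 0.5 * theta @ mu_hat + 0.5 * lam * theta @ theta

def mu_gd(X, mu_hat, lam=1e-6, n_iters=500, lr=0.5):
    # Full-batch gradient descent on the factored objective above;
    # a hypothetical stand-in for the paper's stochastic muSGD.
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        z = X @ theta
        grad = (((sigmoid(z) - 0.5)[:, None] * X).mean(axis=0)
                - 0.5 * mu_hat + lam * theta)
        theta -= lr * grad
    return theta
```

The point of the design is visible in the signature: `mu_gd` receives only the unlabeled points X and the mean-operator estimate, so any robust estimate of µ_S can be swapped in without changing the optimizer.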
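
The experiment rows inject symmetric label noise p ∈ [0, 0.45) and measure the gain from estimating µ̂_S on the corrupted sample. Below is a hedged sketch of one standard way to do both, assuming ±1 labels and a known flip rate p: under symmetric noise E[ỹ x] = (1 - 2p) µ_S, so dividing the naive estimate by (1 - 2p) debiases it. The helper names are hypothetical, and the paper's exact estimator µ̂_S may differ in detail.

```python
import numpy as np

def inject_symmetric_noise(y, p, rng):
    # Flip each ±1 label independently with probability p (symmetric label noise).
    return np.where(rng.random(len(y)) < p, -y, y)

def mean_operator_estimate(X, y_noisy, p):
    # Under symmetric noise, E[y_noisy * x] = (1 - 2p) * mu_S, so the naive
    # estimate divided by (1 - 2p) is unbiased (requires p < 1/2).
    return (X * y_noisy[:, None]).mean(axis=0) / (1.0 - 2.0 * p)
```

With p known (or selected by cross-validation), the resulting µ̂_S plugs directly into `mu_gd` above, so the downstream optimization never touches the noisy labels individually.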