Loss factorization, weakly supervised learning and label noise robustness
Authors: Giorgio Patrini, Frank Nielsen, Richard Nock, Marcello Carioni
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The theory is validated by experiments in which we call the adapted SGD as a black box. ... We experimentally analyze the theory developed so far. ... The next results are based on UCI data. We learn with logistic loss, without the model's intercept, and set λ = 10⁻⁶ and T = 4·2m (4 epochs). We measure d_clean and R_{D,01}, injecting symmetric label noise p ∈ [0, 0.45) and averaging over 25 runs (see the noise-injection sketch after this table). ... We conclude with a systematic study of the hold-out error of µSGD. The same datasets are now split into 1/5 test and 4/5 training sets once at random. In contrast with the previous experimental setting, we perform cross-validation of λ ∈ 10^{−3,...,+3} on 5 folds in the training set. We compare with vanilla SGD run on the corrupted sample S and measure the gain from estimating μ̂_S. ... Table 2 reports test error for SGD and µSGD over 25 trials of artificially corrupted datasets. |
| Researcher Affiliation | Collaboration | Giorgio Patrini¹,² (giorgio.patrini@anu.edu.au), Frank Nielsen³,⁴ (nielsen@lix.polytechnique.fr), Richard Nock²,¹ (richard.nock@data61.csiro.au), Marcello Carioni⁵ (marcello.carioni@mis.mpg.de); ¹Australian National University, ²Data61, ³École Polytechnique, ⁴Sony Computer Science Laboratories Inc., ⁵Max Planck Institute for Mathematics in the Sciences |
| Pseudocode | Yes | Algorithm 1: µSGD ... Algorithm 2: µSGD applied on noisy labels (a hedged sketch of the underlying loss factorization follows this table) |
| Open Source Code | No | No statement or link is provided regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The next results are based on UCI data. |
| Dataset Splits | Yes | The same datasets are now split into 1/5 test and 4/5 training sets once at random. In contrast with the previous experimental setting, we perform cross-validation of λ ∈ 10^{−3,...,+3} on 5 folds in the training set. (See the split/cross-validation sketch after this table.) |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, processor types, or memory amounts) are mentioned for the experimental setup. |
| Software Dependencies | No | We learn with logistic loss... The learning rate η is untouched from Shalev-Shwartz et al. (2011)... We learn with λ = 10⁻⁶ by standard square loss. |
| Experiment Setup | Yes | We learn with logistic loss, without the model's intercept, and set λ = 10⁻⁶ and T = 4·2m (4 epochs). ... we perform cross-validation of λ ∈ 10^{−3,...,+3} on 5 folds in the training set. |
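
The noise model quoted in the Research Type row is symmetric label noise: each label is flipped independently with probability p ∈ [0, 0.45), and test error is averaged over 25 trials per noise rate. A minimal sketch of that protocol, assuming labels in {−1, +1}; `train_and_eval` is a hypothetical stand-in for whichever learner (SGD or µSGD) is being measured:

```python
import numpy as np

def inject_symmetric_noise(y, p, rng):
    """Flip each +/-1 label independently with probability p."""
    flips = rng.random(y.shape[0]) < p
    return np.where(flips, -y, y)

def averaged_error(X, y, train_and_eval, p, trials=25, seed=0):
    """Average test error over `trials` independent corruptions, as in the
    quoted 25-run protocol. `train_and_eval` is a hypothetical stand-in."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(trials):
        y_noisy = inject_symmetric_noise(y, p, rng)
        errors.append(train_and_eval(X, y_noisy))
    return float(np.mean(errors))
```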
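
Algorithm 1 (µSGD) rests on the paper's loss factorization: for a linear-odd loss such as the logistic loss, ℓ(z) − ℓ(−z) = −z, so the empirical risk splits into a label-free even part plus a linear term through the mean operator μ_S = (1/m) Σᵢ yᵢxᵢ; under symmetric noise with known flip rate p, the mean operator can be de-biased by 1/(1 − 2p), which is the idea behind Algorithm 2. The sketch below is an assumption-laden simplification (plain full-batch gradient descent rather than the paper's Pegasos-style SGD, and function names are mine), showing only the factorization and the correction:

```python
import numpy as np

def mean_operator(X, y):
    """mu_S = (1/m) sum_i y_i x_i -- the only label statistic the factored
    risk needs."""
    return (y[:, None] * X).mean(axis=0)

def corrected_mean_operator(X, y_noisy, p):
    """Under symmetric flip rate p, E[mu_noisy] = (1 - 2p) * mu_clean,
    so dividing de-biases the estimate (the correction used on noisy labels)."""
    return mean_operator(X, y_noisy) / (1.0 - 2.0 * p)

def factored_logistic_risk_grad(theta, X, mu, lam):
    """Gradient of: mean_i l_even(<theta, x_i>) - 0.5 <theta, mu>
    + 0.5 * lam * ||theta||^2, where l_even(z) = (log(1+e^-z) + log(1+e^z))/2
    is the label-free even part of the logistic loss."""
    z = X @ theta
    # d/dz l_even(z) = sigmoid(z) - 0.5
    g_even = (1.0 / (1.0 + np.exp(-z)) - 0.5) @ X / X.shape[0]
    return g_even - 0.5 * mu + lam * theta

def mu_gd(X, mu, lam=1e-6, lr=0.1, epochs=400):
    """Full-batch gradient descent on the factored risk; a simplification of
    the paper's muSGD, which uses Pegasos-style stochastic updates."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        theta -= lr * factored_logistic_risk_grad(theta, X, mu, lam)
    return theta
```

Calling `mu_gd(X, corrected_mean_operator(X, y_noisy, p))` would then mimic the structure of Algorithm 2: the features enter only through the label-free even part, and the (de-biased) mean operator carries all the label information.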
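
The hold-out protocol quoted above (a single random 4/5 train, 1/5 test split, then 5-fold cross-validation of λ over 10^{−3}, ..., 10^{+3} within the training set) can be sketched as follows. The grid and fold count match the quote; `fit` and `error` are hypothetical stand-ins for the learner and its 0-1 test error:

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold

def select_lambda(X_tr, y_tr, fit, error, seed=0):
    """5-fold CV over lambda in {10^-3, ..., 10^+3}, as in the quoted setup.
    `fit(X, y, lam)` and `error(model, X, y)` are hypothetical stand-ins."""
    grid = [10.0 ** k for k in range(-3, 4)]
    kf = KFold(n_splits=5, shuffle=True, random_state=seed)
    cv_err = {lam: 0.0 for lam in grid}
    for tr_idx, va_idx in kf.split(X_tr):
        for lam in grid:
            model = fit(X_tr[tr_idx], y_tr[tr_idx], lam)
            cv_err[lam] += error(model, X_tr[va_idx], y_tr[va_idx]) / 5.0
    return min(cv_err, key=cv_err.get)

# Single random 1/5 test split, as described in the Dataset Splits row:
# X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
```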