When Does Privileged Information Explain Away Label Noise?

Authors: Guillermo Ortiz-Jimenez, Mark Collier, Anant Nawalgaria, Alexander Nicholas D’Amour, Jesse Berent, Rodolphe Jenatton, Efi Kokiopoulou

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through experiments on multiple datasets with real PI (CIFAR-N/H) and a new large-scale benchmark ImageNet-PI, we find that PI is most helpful when it allows networks to easily distinguish clean from mislabeled data, while enabling a learning shortcut to memorize the mislabeled examples.
Researcher Affiliation | Collaboration | Guillermo Ortiz-Jimenez: École Polytechnique Fédérale de Lausanne (EPFL), work done during an internship at Google; remaining authors: Google Research.
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code can be found at https://github.com/google/uncertainty-baselines.
Open Datasets | Yes | We use relabelled versions of standard image recognition datasets... CIFAR-10/100N. A relabelled version of the CIFAR-10/100 datasets (Krizhevsky, 2009)... CIFAR-10H. An alternative human-relabelled version of CIFAR-10 (Peterson et al., 2019)... ImageNet-PI. ...re-labelled version of ImageNet (Deng et al., 2009)... The data is publicly available at https://github.com/google-research-datasets/imagenet_pi.
Dataset Splits | Yes | For CIFAR-10N and CIFAR-100N we split the original training set into a training and a validation set; 98% of the examples are used for training and the remaining 2% used as a validation set. ... For distillation models we set the distillation temperature to be 0.5. We use 1% of the original ImageNet training set as a validation set. (A minimal split sketch is given after the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models (e.g., Intel Xeon), or specific TPU versions used for running experiments.
Software Dependencies | No | The paper mentions `tf.keras.applications` and `tensorflow` but does not provide specific version numbers for these or any other software dependencies, such as Python or PyTorch versions, or specific libraries with their versions.
Experiment Setup | Yes | All CIFAR models are trained using an SGD optimizer with 0.9 Nesterov momentum for 90 epochs with a batch size of 256. We sweep over an initial learning rate of {0.01, 0.1} with the learning rate decayed by a factor of 0.2 after 27, 54 and 72 epochs. We sweep over an L2 regularization parameter of {0.00001, 0.0001, 0.001}. ... ImageNet models are trained using an SGD optimizer with 0.9 Nesterov momentum for 90 epochs with a batch size of 128. We set the initial learning rate to 0.05 with the learning rate decayed by a factor of 0.1 after 30, 60 and 80 epochs. We sweep over an L2 regularization parameter of {0.00001, 0.0001}. (A hedged sketch of the CIFAR training sweep follows the table.)
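
The 98%/2% split quoted in the Dataset Splits row can be illustrated in a few lines of NumPy/TensorFlow. This is a minimal sketch under assumptions: it loads the clean CIFAR-10 arrays from `tf.keras.datasets`, whereas the paper replaces the training labels with the CIFAR-10N human annotations and attaches privileged information, neither of which is loaded here; the random seed is an arbitrary choice.

```python
import numpy as np
import tensorflow as tf

# Load the clean CIFAR-10 arrays. In the paper, the training labels are the
# human-annotated noisy labels from CIFAR-10N (plus privileged information);
# loading those is omitted here, so this only illustrates the split itself.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# 98% training / 2% validation split of the original training set,
# as reported for CIFAR-10N and CIFAR-100N. The seed is not from the paper.
rng = np.random.default_rng(seed=0)
perm = rng.permutation(len(x_train))
n_val = int(0.02 * len(x_train))  # 1,000 of the 50,000 training examples
val_idx, train_idx = perm[:n_val], perm[n_val:]

x_val, y_val = x_train[val_idx], y_train[val_idx]
x_tr, y_tr = x_train[train_idx], y_train[train_idx]
```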
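
The CIFAR training sweep from the Experiment Setup row can likewise be sketched in `tf.keras`. Only the optimizer settings (SGD, 0.9 Nesterov momentum), batch size, epoch count, step decay, and the learning-rate/L2 grids come from the reported setup; the tiny convolutional network, the plain cross-entropy loss, and the reuse of `x_tr`/`x_val` from the split sketch above are stand-ins, not the paper's architectures or PI-aware losses.

```python
import itertools
import tensorflow as tf

# Grid quoted for the CIFAR models.
INIT_LRS = [0.01, 0.1]
L2_REGS = [1e-5, 1e-4, 1e-3]
BATCH_SIZE = 256
EPOCHS = 90
DECAY_EPOCHS = (27, 54, 72)  # learning rate multiplied by 0.2 at each


def make_model(l2: float) -> tf.keras.Model:
    # Tiny placeholder network; the paper trains larger architectures.
    reg = tf.keras.regularizers.l2(l2)
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 3)),
        tf.keras.layers.Rescaling(1.0 / 255),
        tf.keras.layers.Conv2D(32, 3, activation="relu", kernel_regularizer=reg),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(10, kernel_regularizer=reg),
    ])


def lr_schedule(init_lr: float):
    def schedule(epoch: int, lr: float) -> float:
        # Stateless step decay: multiply by 0.2 after epochs 27, 54 and 72.
        return init_lr * 0.2 ** sum(epoch >= e for e in DECAY_EPOCHS)
    return schedule


for init_lr, l2 in itertools.product(INIT_LRS, L2_REGS):
    model = make_model(l2)
    model.compile(
        optimizer=tf.keras.optimizers.SGD(
            learning_rate=init_lr, momentum=0.9, nesterov=True),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    model.fit(
        x_tr, y_tr,
        batch_size=BATCH_SIZE,
        epochs=EPOCHS,
        validation_data=(x_val, y_val),
        callbacks=[tf.keras.callbacks.LearningRateScheduler(lr_schedule(init_lr))],
    )
```

The validation split is what the sweep would be selected on; the ImageNet-PI setup quoted in the same row (batch size 128, fixed initial learning rate 0.05, decay by 0.1 at epochs 30/60/80) follows the same pattern with different constants.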