When Does Privileged Information Explain Away Label Noise?
Authors: Guillermo Ortiz-Jimenez, Mark Collier, Anant Nawalgaria, Alexander Nicholas D’Amour, Jesse Berent, Rodolphe Jenatton, Efi Kokiopoulou
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments on multiple datasets with real PI (CIFAR-N/H) and a new large-scale benchmark, ImageNet-PI, we find that PI is most helpful when it allows networks to easily distinguish clean from mislabeled data, while enabling a learning shortcut to memorize the mislabeled examples. |
| Researcher Affiliation | Collaboration | Guillermo Ortiz-Jimenez: École Polytechnique Fédérale de Lausanne (EPFL), with the work done during an internship at Google; the other authors are affiliated with Google Research. |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code can be found at https://github.com/google/uncertainty-baselines. |
| Open Datasets | Yes | We use relabelled versions of standard image recognition datasets... CIFAR-10/100N. A relabelled version of the CIFAR-10/100 datasets (Krizhevsky, 2009)... CIFAR-10H. An alternative human-relabelled version of CIFAR-10 (Peterson et al., 2019)... ImageNet-PI. ...relabelled version of ImageNet (Deng et al., 2009)... The data is publicly available at https://github.com/google-research-datasets/imagenet_pi. |
| Dataset Splits | Yes | For CIFAR-10N and CIFAR-100N we split the original training set into a training and a validation set; 98% of the examples are used for training and the remaining 2% used as a validation set. ... For distillation models we set the distillation temperature to be 0.5. We use 1% of the original ImageNet training set as a validation set. (A hedged sketch of the CIFAR-N split appears after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models (e.g., Intel Xeon), or specific TPU versions used for running experiments. |
| Software Dependencies | No | The paper mentions `tf.keras.applications` and `tensorflow` but does not provide specific version numbers for these or any other software dependencies, such as Python or PyTorch versions, or specific libraries with their versions. |
| Experiment Setup | Yes | All CIFAR models are trained using an SGD optimizer with 0.9 Nesterov momentum for 90 epochs with a batch size of 256. We sweep over an initial learning rate of {0.01, 0.1} with the learning rate decayed by a factor of 0.2 after 27, 54 and 72 epochs. We sweep over an L2 regularization parameter of {0.00001, 0.0001, 0.001}. ... ImageNet models are trained using an SGD optimizer with 0.9 Nesterov momentum for 90 epochs with a batch size of 128. We set the initial learning rate to 0.05 with the learning rate decayed by a factor of 0.1 after 30, 60 and 80 epochs. We sweep over an L2 regularization parameter of {0.00001, 0.0001}. (A hedged TensorFlow sketch of the CIFAR recipe appears after the table.) |
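
The 98%/2% train/validation split reported for CIFAR-10N/100N is straightforward to reproduce. Below is a minimal sketch assuming the images and their noisy labels are already loaded as NumPy arrays; the function name, random seed, and shuffle-based splitting are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def split_train_val(images, noisy_labels, val_fraction=0.02, seed=0):
    """Hold out a small validation set from the noisy CIFAR-N training data.

    The 98%/2% proportions come from the paper; the shuffling, the seed, and
    the array-based interface are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(images))
    n_val = int(round(val_fraction * len(images)))
    val_idx, train_idx = perm[:n_val], perm[n_val:]
    return (images[train_idx], noisy_labels[train_idx],
            images[val_idx], noisy_labels[val_idx])
```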
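
The CIFAR training recipe quoted in the Experiment Setup row (Nesterov SGD with momentum 0.9, 90 epochs, batch size 256, learning rate decayed by 0.2 after epochs 27/54/72, and a small grid over the initial learning rate and L2 regularization) can be expressed as a short TensorFlow sketch. Only those quoted values come from the paper; the tiny placeholder model and everything else below are assumptions standing in for the paper's actual architectures and data pipeline.

```python
import itertools
import tensorflow as tf

# 98% of the 50k CIFAR training images at batch size 256 (figures quoted in the table above).
STEPS_PER_EPOCH = int(50_000 * 0.98) // 256


def make_lr_schedule(initial_lr):
    """Decay the learning rate by a factor of 0.2 after epochs 27, 54 and 72."""
    boundaries = [epoch * STEPS_PER_EPOCH for epoch in (27, 54, 72)]
    values = [initial_lr * 0.2 ** i for i in range(len(boundaries) + 1)]
    return tf.keras.optimizers.schedules.PiecewiseConstantDecay(boundaries, values)


# Sweep quoted for CIFAR: initial LR in {0.01, 0.1}, L2 in {1e-5, 1e-4, 1e-3}.
for initial_lr, l2 in itertools.product([0.01, 0.1], [1e-5, 1e-4, 1e-3]):
    optimizer = tf.keras.optimizers.SGD(
        learning_rate=make_lr_schedule(initial_lr),
        momentum=0.9,
        nesterov=True,
    )
    # Placeholder model: the paper's architectures are not reproduced here.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(32, 32, 3)),
        tf.keras.layers.Dense(10, kernel_regularizer=tf.keras.regularizers.l2(l2)),
    ])
    model.compile(
        optimizer=optimizer,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    # model.fit(train_images, noisy_labels, batch_size=256, epochs=90, ...) would then
    # train on the 98%/2% split described in the Dataset Splits row.
```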