reproducibilityindex.ai

Improving generalization by controlling label-noise information in neural network weights

Authors: Hrayr Harutyunyan, Kyle Reing, Greg Ver Steeg, Aram Galstyan

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We illustrate the effectiveness of our approach on versions of MNIST, CIFAR-10, and CIFAR-100 corrupted with various noise models, and on a large-scale dataset Clothing1M that has noisy labels. We show that methods based on gradient prediction yield drastic improvements over standard training algorithms (like cross-entropy loss), and outperform competitive approaches designed for learning with noisy labels.
Researcher Affiliation	Academia	Hrayr Harutyunyan¹, Kyle Reing¹, Greg Ver Steeg¹, Aram Galstyan¹. ¹Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292.
Pseudocode	Yes	The pseudocode of LIMIT is presented in the supplementary material (Alg. 1).
Open Source Code	Yes	The implementation of the proposed method and the code for replicating the experiments is available at https://github.com/hrayrhar/limit-label-memorization.
Open Datasets	Yes	We illustrate the effectiveness of our approach on versions of MNIST, CIFAR-10, and CIFAR-100 corrupted with various noise models, and on a large-scale dataset Clothing1M that has noisy labels.
Dataset Splits	Yes	We split the 60K images of MNIST into training and validation sets, containing 48K and 12K samples respectively. We split the 50K images of CIFAR-10 into training and validation sets, containing 40K and 10K samples respectively.
Hardware Specification	No	No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running experiments were mentioned in the paper.
Software Dependencies	No	The paper mentions using ADAM optimizer and ResNet-34 networks, but does not provide specific version numbers for software dependencies like Python, PyTorch, or TensorFlow.
Experiment Setup	Yes	We train all baselines except DMI using the ADAM optimizer (Kingma & Ba, 2014) with learning rate = 10^-3 and β1 = 0.9. As DMI is very sensitive to the learning rate, we tune it by choosing the best from the following grid of values {10^-3, 10^-4, 10^-5, 10^-6}.
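
The split sizes and optimizer settings quoted in the table can be written down as a small configuration sketch. This is an illustration only, not the authors' code: the table does not say how the validation samples were chosen, so the seeded random shuffle is an assumption, and `split_indices`, `pick_best_lr`, and `validation_score` are hypothetical names introduced here. Only the sizes (48K/12K, 40K/10K), the learning rate, β1, and the DMI grid come from the quoted text.

```python
import random


def split_indices(n_total, n_val, seed=0):
    """Shuffle range(n_total) and hold out n_val indices for validation.

    The random shuffle and the seed are assumptions; the source only
    states the resulting split sizes.
    """
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    return indices[n_val:], indices[:n_val]


# Sizes quoted in the Dataset Splits row.
mnist_train, mnist_val = split_indices(60_000, 12_000)
cifar_train, cifar_val = split_indices(50_000, 10_000)

# Settings quoted in the Experiment Setup row (all baselines except DMI).
# Other Adam hyperparameters are not given in the quoted text.
ADAM_CONFIG = {"lr": 1e-3, "beta1": 0.9}

# Learning-rate grid quoted for DMI.
DMI_LR_GRID = [1e-3, 1e-4, 1e-5, 1e-6]


def pick_best_lr(grid, validation_score):
    """Return the grid value with the highest validation score.

    validation_score is a hypothetical callable mapping lr -> float,
    e.g. validation accuracy after training with that learning rate.
    """
    return max(grid, key=validation_score)
```

In practice `validation_score` would train a model at each learning rate and evaluate it on the held-out validation indices; any callable that returns a comparable score works with this sketch.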