Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Improving generalization by controlling label-noise information in neural network weights
Authors: Hrayr Harutyunyan, Kyle Reing, Greg Ver Steeg, Aram Galstyan
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the effectiveness of our approach on versions of MNIST, CIFAR-10, and CIFAR-100 corrupted with various noise models, and on a large-scale dataset Clothing1M that has noisy labels. We show that methods based on gradient prediction yield drastic improvements over standard training algorithms (like cross-entropy loss), and outperform competitive approaches designed for learning with noisy labels. |
| Researcher Affiliation | Academia | Hrayr Harutyunyan¹, Kyle Reing¹, Greg Ver Steeg¹, Aram Galstyan¹. ¹Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292. |
| Pseudocode | Yes | The pseudocode of LIMIT is presented in the supplementary material (Alg. 1). |
| Open Source Code | Yes | The implementation of the proposed method and the code for replicating the experiments is available at https://github.com/hrayrhar/limit-label-memorization. |
| Open Datasets | Yes | We illustrate the effectiveness of our approach on versions of MNIST, CIFAR-10, and CIFAR-100 corrupted with various noise models, and on a large-scale dataset Clothing1M that has noisy labels. |
| Dataset Splits | Yes | We split the 60K images of MNIST into training and validation sets, containing 48K and 12K samples respectively. We split the 50K images of CIFAR-10 into training and validation sets, containing 40K and 10K samples respectively. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | The paper mentions using ADAM optimizer and ResNet-34 networks, but does not provide specific version numbers for software dependencies like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | We train all baselines except DMI using the ADAM optimizer (Kingma & Ba, 2014) with learning rate = $10^{-3}$ and $\beta_1 = 0.9$. As DMI is very sensitive to the learning rate, we tune it by choosing the best from the following grid of values $\{10^{-3}, 10^{-4}, 10^{-5}, 10^{-6}\}$. |
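
The Dataset Splits and Experiment Setup rows above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the split sizes, the ADAM settings, and the DMI learning-rate grid come from the quoted text, while the random shuffling, the seed, and every name (`split_indices`, `pick_best_lr`, `validation_score`) are assumptions introduced here. The quoted text does not say how samples are assigned to each split or which metric selects the best learning rate.

```python
import random

# (train, validation) sizes quoted in the Dataset Splits row.
SPLIT_SIZES = {
    "MNIST": (48_000, 12_000),
    "CIFAR-10": (40_000, 10_000),
}

# Optimizer settings quoted in the Experiment Setup row.
ADAM_CONFIG = {"lr": 1e-3, "beta1": 0.9}
DMI_LR_GRID = [1e-3, 1e-4, 1e-5, 1e-6]


def split_indices(n_train, n_val, seed=0):
    """Shuffle range(n_train + n_val) and return disjoint (train, val) index lists.

    Random shuffling and the default seed are assumptions of this sketch.
    """
    indices = list(range(n_train + n_val))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


def pick_best_lr(grid, validation_score):
    """Return the learning rate in `grid` with the highest validation score.

    `validation_score` is a hypothetical callable (lr -> float) standing in
    for "train at this learning rate, then evaluate on the validation set".
    """
    return max(grid, key=validation_score)
```

In a real run, `validation_score` would wrap a full training loop; for the authors' actual procedure, the repository linked in the Open Source Code row is the authoritative source.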