Early-Learning Regularization Prevents Memorization of Noisy Labels

Authors: Sheng Liu, Jonathan Niles-Weed, Narges Razavian, Carlos Fernandez-Granda

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We prove that early learning and memorization are fundamental phenomena in high-dimensional classification tasks, even in simple linear models, and give a theoretical explanation in this setting. Motivated by these findings, we develop a new technique for noisy classification tasks... The resulting framework is shown to provide robustness to noisy annotations on several standard benchmarks and real-world datasets, where it achieves results comparable to the state of the art.
Researcher Affiliation | Academia | Sheng Liu, Center for Data Science, New York University (shengliu@nyu.edu); Jonathan Niles-Weed, Center for Data Science and Courant Inst. of Mathematical Sciences, New York University (jnw@cims.nyu.edu); Narges Razavian, Department of Population Health and Department of Radiology, NYU School of Medicine (narges.razavian@nyulangone.org); Carlos Fernandez-Granda, Center for Data Science and Courant Inst. of Mathematical Sciences, New York University (cfgranda@cims.nyu.edu)
Pseudocode | No | The paper describes the methodology using text and mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code to reproduce the experiments is publicly available online at https://github.com/shengliu66/ELR.
Open Datasets | Yes | We evaluate the proposed methodology on two standard benchmarks with simulated label noise, CIFAR-10 and CIFAR-100 [18], and two real-world datasets, Clothing1M [47] and WebVision [24].
Dataset Splits | Yes | Table G.1 in the supplementary material reports additional details about the datasets, and our training, validation and test splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1).
Experiment Setup | Yes | We focus on two variants of the proposed approach: ELR with temporal ensembling, which we call ELR, and ELR with temporal ensembling, weight averaging, two networks, and mixup data augmentation, which we call ELR+ (see Section F).
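
For context, the objective underlying both variants pairs standard cross-entropy with a regularizer computed against temporally ensembled model predictions. The sketch below, in PyTorch, is a minimal illustration of such an ELR-style loss; the hyperparameter names and default values (beta for the ensembling momentum, lam for the regularization weight), the clamping, and the per-example indexing scheme are assumptions made for illustration, not the authors' reference implementation, which is available at the repository linked above.

```python
import torch
import torch.nn.functional as F


class ELRLoss(torch.nn.Module):
    """Cross-entropy plus an early-learning regularization term (illustrative sketch)."""

    def __init__(self, num_examples, num_classes, beta=0.7, lam=3.0):
        super().__init__()
        # Temporally ensembled soft targets, one row per training example (assumption:
        # initialized to zeros and updated online as batches are seen).
        self.register_buffer("targets", torch.zeros(num_examples, num_classes))
        self.beta = beta  # momentum of the temporal ensemble
        self.lam = lam    # weight of the regularization term

    def forward(self, logits, labels, indices):
        # Clamped, renormalized softmax outputs for numerical stability.
        probs = F.softmax(logits, dim=1).clamp(1e-4, 1.0 - 1e-4)
        probs = probs / probs.sum(dim=1, keepdim=True)

        # Update the ensembled targets for this batch; no gradient flows through them.
        with torch.no_grad():
            self.targets[indices] = (
                self.beta * self.targets[indices]
                + (1.0 - self.beta) * probs.detach()
            )

        ce = F.cross_entropy(logits, labels)
        # Regularizer log(1 - <target, prediction>): minimizing it pulls predictions
        # toward the ensembled targets, counteracting memorization of noisy labels.
        dot = (self.targets[indices] * probs).sum(dim=1).clamp(max=1.0 - 1e-4)
        elr = torch.log(1.0 - dot).mean()
        return ce + self.lam * elr
```

In such a setup the training dataset would need to return each example's index along with its image and (possibly noisy) label, so the per-example ensembled target can be retrieved and updated; the quoted ELR+ variant additionally combines this type of objective with weight averaging, two networks, and mixup data augmentation.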