Noise Attention Learning: Enhancing Noise Robustness by Gradient Scaling

Authors: Yangdi Lu, Yang Bo, Wenbo He

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show that most of the mislabeled samples yield significantly lower weights than the clean ones. Furthermore, our theoretical analysis shows that the gradients of training samples are dynamically scaled by the attention weights, implicitly preventing memorization of the mislabeled samples. Experimental results on two benchmarks (CIFAR-10 and CIFAR-100) with simulated label noise and three real-world noisy datasets (ANIMAL-10N, Clothing1M and WebVision) demonstrate that our approach outperforms state-of-the-art methods. (See the first sketch after the table.)
Researcher Affiliation | Academia | Yangdi Lu, Department of Computing and Software, McMaster University, luy100@mcmaster.ca; Yang Bo, Department of Computing and Software, McMaster University, boy2@mcmaster.ca; Wenbo He, Department of Computing and Software, McMaster University, hew11@mcmaster.ca
Pseudocode | Yes | Algorithm 1: Noise Attention Learning (NAL) pseudocode
Open Source Code | No | The paper does not provide concrete access (e.g., a specific repository link or explicit statement of code release) to the source code for the methodology described.
Open Datasets | Yes | We evaluate our approach on two benchmarks CIFAR-10 and CIFAR-100 [2] with simulated label noise, and three real-world datasets, ANIMAL-10N [15], Clothing1M [16] and WebVision [3].
Dataset Splits | Yes | All the compared methods are evaluated on WebVision and ImageNet ILSVRC12 validation sets.
Hardware Specification | Yes | All experiments are implemented in PyTorch and run on a single NVIDIA A100 GPU.
Software Dependencies | No | The paper states 'All experiments are implemented in PyTorch' but does not provide specific version numbers for PyTorch or other software dependencies.
Experiment Setup | Yes | For CIFAR with class-conditional noise, we use a ResNet-34 [39] and train it using SGD with a batch size of 64. For CIFAR-10 and ANIMAL-10N, we set λ = 0.5. For CIFAR-100, we set λ = 10. For WebVision and Clothing1M, we set λ = 50. (See the second sketch after the table.)
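
The first sketch below illustrates the gradient-scaling idea summarized in the Research Type row: each sample's cross-entropy is multiplied by an attention weight in (0, 1), so low-weight (likely mislabeled) samples contribute proportionally smaller gradients. This is a minimal sketch only; the sigmoid weighting and the log-based regularizer are assumptions for illustration, not the authors' exact NAL objective.

```python
import torch
import torch.nn.functional as F


def attention_weighted_loss(logits, attention_scores, targets, lam=0.5):
    """Per-sample attention-weighted cross-entropy (illustrative only).

    Each sample's loss is multiplied by a weight w_i in (0, 1); in the
    backward pass the gradient contributed by sample i is scaled by the
    same w_i, so low-weight (likely mislabeled) samples barely update
    the network.
    """
    per_sample_ce = F.cross_entropy(logits, targets, reduction="none")
    weights = torch.sigmoid(attention_scores).squeeze(-1)  # w_i in (0, 1)
    weighted_term = (weights * per_sample_ce).mean()
    # Placeholder regularizer to keep the weights from collapsing to zero;
    # lambda is assumed to balance such a term, but the exact form used by
    # NAL is not specified in this report.
    reg_term = -lam * torch.log(weights + 1e-8).mean()
    return weighted_term + reg_term
```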
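
The second sketch wires the loss above into the training setup quoted in the Experiment Setup row (ResNet-34, SGD, batch size 64, λ = 0.5 for CIFAR-10). The linear attention head, learning rate, momentum, and weight decay are assumptions; the excerpt does not state them.

```python
import torch
from torch import nn
from torchvision.models import resnet34

# Quoted values: ResNet-34, SGD, batch size 64, lambda = 0.5 for CIFAR-10 /
# ANIMAL-10N (10 for CIFAR-100, 50 for WebVision / Clothing1M). Learning
# rate, momentum, and weight decay below are common defaults, not quoted.
BATCH_SIZE = 64
LAMBDA = 0.5

model = resnet34(num_classes=10)   # CIFAR-10 backbone named in the paper
attention_head = nn.Linear(10, 1)  # hypothetical per-sample score head
optimizer = torch.optim.SGD(
    list(model.parameters()) + list(attention_head.parameters()),
    lr=0.02, momentum=0.9, weight_decay=5e-4,
)


def train_one_epoch(loader):
    # loader is assumed to be a DataLoader built with batch_size=BATCH_SIZE.
    model.train()
    for images, targets in loader:
        logits = model(images)
        # Stand-in attention branch: one score per sample from the logits.
        scores = attention_head(logits)
        loss = attention_weighted_loss(logits, scores, targets, lam=LAMBDA)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```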