Active Negative Loss Functions for Learning with Noisy Labels

Authors: Xichen Ye, Xiaoqiang Li, Songmin Dai, Tong Liu, Yan Sun, Weiqin Tong

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on benchmark and real-world datasets demonstrate that the new set of loss functions created by our ANL framework can outperform state-of-the-art methods.
Researcher Affiliation | Academia | All six authors are affiliated with Shanghai University, Shanghai, China: Xichen Ye (yexichen0930@shu.edu.cn), Xiaoqiang Li (xqli@shu.edu.cn), Songmin Dai (laodar@shu.edu.cn), Tong Liu (tong_liu@shu.edu.cn), Yan Sun (yansun@shu.edu.cn), Weiqin Tong (wqtong@shu.edu.cn).
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; methods are described in prose and mathematical formulations. (An illustrative loss-structure sketch follows the table.)
Open Source Code | Yes | The code is available at https://github.com/Virusdoll/Active-Negative-Loss.
Open Datasets | Yes | In this section, we empirically investigate our proposed ANL functions on benchmark datasets, including MNIST [12], CIFAR-10/CIFAR-100 [13], and a real-world noisy dataset, WebVision [14].
Dataset Splits | Yes | Specifically, we use 10% of the original training set as the validation set, and generate 0.8 symmetric noise on the remaining 90% of the original training set as the training set by the standard noise generation approach. (A noise-generation sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts) used to run its experiments; it only mentions general aspects such as 'training deep neural networks'.
Software Dependencies | No | The paper mentions optimizers such as SGD and Adam but does not provide version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used for the implementation.
Experiment Setup | Yes | For MNIST, CIFAR-10, and CIFAR-100, the networks are trained for 50, 120, and 200 epochs, respectively. For all the training, we use the SGD optimizer with momentum 0.9 and cosine learning rate annealing. Weight decay is set to 1 × 10⁻³, 1 × 10⁻⁴, and 1 × 10⁻⁵ for MNIST, CIFAR-10, and CIFAR-100, respectively. (...) The initial learning rate is set to 0.01 for MNIST/CIFAR-10 and 0.1 for CIFAR-100. Batch size is set to 128. For all settings, we clip the gradient norm to 5.0. Typical data augmentations, including random width/height shift and horizontal flip, are applied. (A training-setup sketch follows the table.)
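
The Pseudocode row notes that the method is given only in prose and equations. As a purely illustrative aid (not the paper's pseudocode), the sketch below shows the APL-style structure that active/negative loss combinations follow: a weighted sum of an "active" term and a second term. Normalized Cross Entropy (NCE) is used as the active term because its formula is standard; plain MAE is only a stand-in for the paper's proposed Normalized Negative Loss Functions, whose exact definition should be taken from the paper and the linked repository. The class name and the `alpha`/`beta` weights are assumptions.

```python
import torch
import torch.nn.functional as F

class ActiveNegativeStyleLoss(torch.nn.Module):
    """Illustrative APL-style combination: alpha * active term + beta * second term.
    NCE (Ma et al., 2020) is the active term; MAE is only a stand-in for the
    paper's Normalized Negative Loss Functions (see the official repository)."""

    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        super().__init__()
        self.alpha = alpha
        self.beta = beta

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        log_probs = F.log_softmax(logits, dim=1)                        # (B, K)
        log_p_y = log_probs.gather(1, target.unsqueeze(1)).squeeze(1)   # log p_y
        p_y = log_p_y.exp()                                             # p_y
        # Active term: Normalized Cross Entropy
        #   NCE = CE(f(x), y) / sum_k CE(f(x), k) = log p_y / sum_k log p_k
        nce = log_p_y / log_probs.sum(dim=1)
        # Stand-in second term: MAE between the one-hot label and the softmax output,
        #   MAE = sum_k |e_k - p_k| = 2 * (1 - p_y)
        mae = 2.0 * (1.0 - p_y)
        return (self.alpha * nce + self.beta * mae).mean()

# Hypothetical usage on random data:
logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
loss = ActiveNegativeStyleLoss(alpha=1.0, beta=1.0)(logits, targets)
```

The official implementation linked in the Open Source Code row is the authoritative reference for the real loss definitions and their hyperparameters.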
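
For the Dataset Splits row, the quoted protocol (10% held out for validation, symmetric noise at rate 0.8 injected into the remaining 90%) can be sketched as below. This is a generic symmetric label-noise injection, not the paper's exact script; whether the held-out 10% is also corrupted, and whether a flip may land on the true class, vary between codebases and are assumptions here (the validation labels are left untouched and flips always change the class).

```python
import numpy as np

def split_and_corrupt(labels, num_classes, noise_rate=0.8, val_fraction=0.1, seed=0):
    """Hold out a validation split, then inject symmetric label noise into the rest."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    perm = rng.permutation(len(labels))
    n_val = int(val_fraction * len(labels))
    val_idx, train_idx = perm[:n_val], perm[n_val:]

    noisy = labels.copy()
    flip_mask = rng.random(len(train_idx)) < noise_rate
    for i in train_idx[flip_mask]:
        # Symmetric noise: replace the label with a uniformly chosen *different* class.
        offset = rng.integers(1, num_classes)
        noisy[i] = (labels[i] + offset) % num_classes
    return train_idx, val_idx, noisy

# Hypothetical usage with CIFAR-10-sized labels:
train_idx, val_idx, noisy_labels = split_and_corrupt(
    np.random.randint(0, 10, 50000), num_classes=10)
```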
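
The Experiment Setup row is concrete enough to sketch a training loop. The snippet below wires up the quoted CIFAR-10 settings (120 epochs, SGD with momentum 0.9, cosine annealing, weight decay 1 × 10⁻⁴, initial LR 0.01, batch size 128, gradient-norm clipping at 5.0, shift and horizontal-flip augmentation) in PyTorch. The backbone (a ResNet-18 here), the interpretation of "width/height shift" as a random affine translation, and the plain cross-entropy criterion (where the paper's ANL loss would go) are assumptions, since the quoted text does not specify them.

```python
import torch
from torch import nn
from torchvision import datasets, models, transforms

# Quoted CIFAR-10 settings; MNIST/CIFAR-100 use 50/200 epochs,
# weight decay 1e-3/1e-5, and initial LR 0.01/0.1.
epochs, lr, weight_decay, batch_size, max_grad_norm = 120, 0.01, 1e-4, 128, 5.0

train_transform = transforms.Compose([
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # assumed width/height shift
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
train_set = datasets.CIFAR10("data", train=True, transform=train_transform, download=True)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=batch_size, shuffle=True)

model = models.resnet18(num_classes=10)          # assumed backbone
criterion = nn.CrossEntropyLoss()                # the paper's ANL loss would replace this
optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                            momentum=0.9, weight_decay=weight_decay)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

for epoch in range(epochs):
    for images, targets in train_loader:
        loss = criterion(model(images), targets)
        optimizer.zero_grad()
        loss.backward()
        nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)  # clip grad norm to 5.0
        optimizer.step()
    scheduler.step()
```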