Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Active Negative Loss Functions for Learning with Noisy Labels

Authors: Xichen Ye, Xiaoqiang Li, Songmin Dai, Tong Liu, Yan Sun, Weiqin Tong

NeurIPS 2023 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results on benchmark and real-world datasets demonstrate that the new set of loss functions created by our ANL framework can outperform state-of-the-art methods. |
| Researcher Affiliation | Academia | Xichen Ye, Shanghai University, Shanghai, China; Xiaoqiang Li, Shanghai University, Shanghai, China; Songmin Dai, Shanghai University, Shanghai, China; Tong Liu, Shanghai University, Shanghai, China; Yan Sun, Shanghai University, Shanghai, China; Weiqin Tong, Shanghai University, Shanghai, China (email addresses redacted) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; methods are described in prose and mathematical formulations. |
| Open Source Code | Yes | The code is available at https://github.com/Virusdoll/Active-Negative-Loss. |
| Open Datasets | Yes | In this section, we empirically investigate our proposed ANL functions on benchmark datasets, including MNIST [12], CIFAR-10/CIFAR-100 [13], and a real-world noisy dataset, WebVision [14]. |
| Dataset Splits | Yes | Specifically, we use 10% of the original training set as the validation set, and generate 0.8 symmetric noise on the remaining 90% of the original training set as the training set by the standard noise generation approach. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts) used for its experiments; it only mentions general aspects such as "training deep neural networks". |
| Software Dependencies | No | The paper mentions optimizers such as SGD and Adam but does not provide version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used for implementation. |
| Experiment Setup | Yes | For MNIST, CIFAR-10, and CIFAR-100, the networks are trained for 50, 120, and 200 epochs, respectively. For all the training, we use the SGD optimizer with momentum 0.9 and cosine learning rate annealing. Weight decay is set to 1 × 10⁻³, 1 × 10⁻⁴, and 1 × 10⁻⁵ for MNIST, CIFAR-10, and CIFAR-100, respectively. (...) The initial learning rate is set to 0.01 for MNIST/CIFAR-10 and 0.1 for CIFAR-100. Batch size is set to 128. For all settings, we clip the gradient norm to 5.0. Typical data augmentations including random width/height shift and horizontal flip are applied. |
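The "Dataset Splits" row quotes the standard symmetric-noise protocol: with noise rate 0.8, each training label is flipped, with probability 0.8, to one of the other classes chosen uniformly at random. A minimal sketch of that procedure is below; it is not taken from the paper's released code, and the function name and signature are illustrative.

```python
import numpy as np

def add_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    """Standard symmetric label noise: with probability `noise_rate`,
    replace each label with a different class drawn uniformly at random."""
    rng = np.random.default_rng(seed)
    noisy = np.asarray(labels).copy()
    flip_mask = rng.random(len(noisy)) < noise_rate
    for i in np.where(flip_mask)[0]:
        # Choose uniformly among the num_classes - 1 classes other than the true one.
        candidates = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(candidates)
    return noisy
```

Applied to the remaining 90% of the training set (after holding out 10% for validation), this yields the 0.8-symmetric-noise training split described in the quote.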
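The "Experiment Setup" row mentions cosine learning rate annealing (initial rate 0.01 for MNIST/CIFAR-10, 0.1 for CIFAR-100). A common form of this schedule decays the rate from its initial value to zero following a half-cosine over the training run; the sketch below shows that formula, under the assumption that the paper uses the standard variant (the function name is illustrative).

```python
import math

def cosine_annealed_lr(initial_lr, epoch, total_epochs):
    """Cosine annealing from initial_lr at epoch 0 down to 0 at total_epochs."""
    return 0.5 * initial_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

For the CIFAR-100 setting (initial rate 0.1, 200 epochs), this gives 0.1 at the start, 0.05 at epoch 100, and approaches 0 at epoch 200.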