Dirichlet-Based Prediction Calibration for Learning with Noisy Labels

Authors: Chen-Chen Zong, Ye-Wen Wang, Ming-Kun Xie, Sheng-Jun Huang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments on diverse benchmark datasets, we demonstrate that DPC achieves state-of-the-art performance.
Researcher Affiliation | Academia | College of Computer Science and Technology/Artificial Intelligence, Nanjing University of Aeronautics and Astronautics; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, China. {chencz, linuswangg, mkxie, huangsj}@nuaa.edu.cn
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/chenchenzong/DPC.
Open Datasets | Yes | We experimentally demonstrate the effectiveness of DPC on both synthetic noise datasets (CIFAR-10 and CIFAR-100 (Krizhevsky, Hinton et al. 2009)) and real-world noise datasets (CIFAR-10N, CIFAR-100N (Wei et al. 2021), and WebVision (Li et al. 2017)).
Dataset Splits | No | The paper mentions the WebVision validation set and the ImageNet ILSVRC12 validation set but gives no split percentages, sample counts, or details on how these validation sets were constructed, relying instead on the standard benchmark splits. For the CIFAR datasets it specifies 50K training and 10K test images, with no explicit validation split.
Hardware Specification | No | The paper does not report the hardware (e.g., GPU/CPU models, processor types, or memory) used to run its experiments.
Software Dependencies | No | The paper does not specify ancillary software such as library or solver names with version numbers.
Experiment Setup | Yes | For CIFAR-10 and CIFAR-100, we use PreAct ResNet18 (He et al. 2016b) as the base model and train it with the stochastic gradient descent (SGD) optimizer with momentum 0.9, weight decay 0.0005, and batch size 128 for 300 epochs. The initial learning rate is set to 0.02 and reduced by a factor of 10 after 150 epochs. The warm-up epoch is 10 for CIFAR-10 and 30 for CIFAR-100. For all experiments, we set β as 0.5 and have γ = 10C.
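
For reference, the quoted CIFAR training configuration can be written down as a short training loop. The following is a minimal sketch assuming a PyTorch/torchvision implementation: the torchvision resnet18 is only a runnable stand-in for the PreAct ResNet-18 of He et al. (2016b), and the DPC-specific calibration objective, warm-up stage, and the β/γ hyperparameters appear only as placeholders rather than being implemented.

```python
# Minimal sketch of the reported CIFAR-10/100 training setup (assumes PyTorch/torchvision).
# The paper's model is PreAct ResNet-18 (He et al. 2016b); torchvision's resnet18 is used
# here only as a runnable stand-in. The DPC loss and warm-up stage are omitted.
import torch
from torch import nn, optim
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

num_classes = 10              # 100 for CIFAR-100
beta = 0.5                    # β reported in the paper (unused in this sketch)
gamma = 10 * num_classes      # γ = 10C reported in the paper (unused in this sketch)

train_set = datasets.CIFAR10("./data", train=True, download=True,
                             transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)

model = models.resnet18(num_classes=num_classes)   # stand-in architecture
optimizer = optim.SGD(model.parameters(), lr=0.02,
                      momentum=0.9, weight_decay=5e-4)
# Initial LR 0.02, reduced by a factor of 10 after 150 of the 300 epochs.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[150], gamma=0.1)

criterion = nn.CrossEntropyLoss()  # placeholder for the DPC training objective
for epoch in range(300):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```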