Delving into Noisy Label Detection with Clean Data
Authors: Chenglin Yu, Xinsong Ma, Weiwei Liu
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | BHN achieves state-of-the-art performance and outperforms baselines by 28.48% in terms of false discovery rate (FDR) and by 18.99% in terms of F1 on CIFAR-10. Extensive ablation studies further demonstrate the superiority of BHN. Our code is available at https://github.com/ChenglinYu/BHN. 4. Experiments: In this section, we first introduce implementation details and baselines for experiments in Section 4.1. Then, we provide quantitative performance results in Sections 4.2 and 4.3. |
| Researcher Affiliation | Academia | Chenglin Yu 1 Xinsong Ma 1 Weiwei Liu 1 1School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China. |
| Pseudocode | Yes | Algorithm 1 (Noisy Label Detection via Multiple Testing): 1: Input: clean data D₀ and noisy data D̃₁ = {(xᵢ, yᵢ)} for i ∈ [N]. 2: Split D₀ into two sets of even size, D₀^train and D₀^cal. 3: Train the neural network f using D₀^train. 4: Calculate the p-values of D̃₁. 5: Use the BH procedure to determine whether to reject or accept each null hypothesis; the examples corresponding to the rejected hypotheses are regarded as corrupted examples. (A code sketch of steps 4 and 5 follows the table.) |
| Open Source Code | Yes | Our code is available at https://github.com/ChenglinYu/BHN. |
| Open Datasets | Yes | We evaluate BHN on three benchmark datasets: CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Clothing1M (Xiao et al., 2015). |
| Dataset Splits | Yes | We use 14,313 clean validation images as the calibration set. |
| Hardware Specification | No | The paper mentions using specific neural network architectures (e.g., ResNet34, ResNet50) as backbones but does not provide details on the hardware (e.g., GPU models, CPU types, memory) used for the experiments. |
| Software Dependencies | No | The paper mentions using Stochastic Gradient Descent (SGD) and neural networks but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | When training the model, we set a batch size of 128. We use Stochastic Gradient Descent (SGD) with a weight decay of 5×10⁻⁴ and a momentum of 0.9. We train the model for 200 epochs. We set an initial learning rate of 0.1 and decrease it by a factor of 10 after 160 epochs. (See the configuration sketch after the table.) |
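
The pseudocode row reduces the detection step to two operations: compute a p-value for each noisy example against the clean calibration split, then run the Benjamini-Hochberg (BH) procedure at a target FDR level. The sketch below illustrates only those two steps; it assumes a generic nonconformity score produced by the trained network f (the quoted algorithm does not specify the score), and the function names are illustrative rather than the authors' implementation.

```python
import numpy as np

def conformal_p_values(cal_scores, test_scores):
    """P-value for each test score against the clean calibration scores.
    Assumed convention: larger score = more 'suspicious' (less like clean data),
    so a small p-value flags a likely corrupted label."""
    cal_scores = np.asarray(cal_scores)
    test_scores = np.asarray(test_scores)
    n = len(cal_scores)
    # p_i = (1 + #{calibration scores >= test score i}) / (n + 1)
    counts = (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
    return (1.0 + counts) / (n + 1.0)

def benjamini_hochberg(p_values, alpha=0.1):
    """Standard BH procedure: boolean mask of rejected hypotheses
    (examples flagged as corrupted) at nominal FDR level alpha."""
    p = np.asarray(p_values)
    m = len(p)
    order = np.argsort(p)                         # ranks p-values in ascending order
    thresholds = alpha * np.arange(1, m + 1) / m  # BH step-up thresholds
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])          # largest rank passing its threshold
        reject[order[: k + 1]] = True             # reject all hypotheses up to that rank
    return reject

# Usage sketch (scores from a model trained on the clean split are assumed given,
# e.g. a negative log-probability of the observed label):
# cal_scores  = score(f, D0_cal)     # clean calibration examples
# test_scores = score(f, D1_noisy)   # examples under test
# flagged = benjamini_hochberg(conformal_p_values(cal_scores, test_scores), alpha=0.1)
```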
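
The experiment-setup row maps directly onto an optimizer and learning-rate schedule. A minimal sketch follows, assuming PyTorch (the quote names SGD and the hyperparameters but not a framework); `model` and `train_set` are hypothetical placeholders for the backbone and the clean training split.

```python
import torch
from torch.utils.data import DataLoader

def build_training(model, train_set):
    # Batch size 128, as quoted in the experiment setup.
    loader = DataLoader(train_set, batch_size=128, shuffle=True)
    # SGD with initial lr 0.1, momentum 0.9, weight decay 5e-4.
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=0.1,
        momentum=0.9,
        weight_decay=5e-4,
    )
    # Decay the learning rate by a factor of 10 after epoch 160 (of 200 total epochs).
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[160], gamma=0.1)
    return loader, optimizer, scheduler
```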