Detecting Corrupted Labels Without Training a Model to Predict

Authors: Zhaowei Zhu, Zihao Dong, Yang Liu

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments with both synthetic and real-world label noise demonstrate our training-free solutions consistently and significantly improve most of the training-based baselines.
Researcher Affiliation | Academia | Zhaowei Zhu, Zihao Dong, Yang Liu (Department of Computer Science and Engineering, University of California, Santa Cruz, CA, USA).
Pseudocode | Yes | Algorithm 1 summarizes our solution.
Open Source Code | Yes | Code is available at github.com/UCSC-REAL/SimiFeat.
Open Datasets | Yes | Experiments with both synthetic and real-world label noise demonstrate our training-free solutions consistently and significantly improve most of the training-based baselines. ... We use the 50,000 noisy training labels (η ≈ 0.16) for CIFAR-10 collected by (Zhu et al., 2021b), and 50,000 noisy training labels (η ≈ 0.40) for CIFAR-100 collected by (Wei et al., 2022d). ... For Clothing1M (Xiao et al., 2015)
Dataset Splits | No | The paper discusses using existing benchmark datasets but does not describe explicit train/validation/test split details.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments (e.g., GPU/CPU models, memory).
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies used in the experiments (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | The only hyperparameters in our methods are the number of epochs M and the k-NN parameter k. Intuitively, a larger M returns a collective result from more times of detection, which should be more accurate. But a larger M takes more time. We set M = 21 (an odd number for better tie-breaking) for an efficient solution. The hyperparameter k cannot be set too large as demonstrated in Figure 3. From Figure 3, we notice that the lower bound (RHS figure) is relatively high when k = 10 for all settings. Therefore, in CIFAR (Krizhevsky et al., 2009) experiments, rather than fine-tune M and k for different settings, we fix M = 21 and k = 10. ... For a fair comparison, we refer to the thresholds learned by confident learning (Northcutt et al., 2021a). ... The initial learning rate is 0.1 and decays to 0.01 at epoch 50. ... We train the model for 80 epochs with a batch size of 32. We sample 1,000 mini-batches per epoch randomly selected from 1M training instances. ... The learning rate is 0.002.
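
To make the quoted setup concrete, below is a minimal, hypothetical sketch of the k-NN majority-vote detection summarized in the Pseudocode and Experiment Setup rows, using the reported hyperparameters M = 21 and k = 10. The function names and the feature_extractor interface are assumptions for illustration, not the authors' released SimiFeat implementation.

```python
# Hypothetical sketch of k-NN majority-vote label-noise detection (in the spirit
# of Algorithm 1), with the paper's reported hyperparameters M = 21 and k = 10.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_vote_round(features, noisy_labels, k=10):
    """Flag instances whose noisy label disagrees with the majority label of their k neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)             # idx[:, 0] is each point itself
    neighbor_labels = noisy_labels[idx[:, 1:]]   # labels of the k nearest neighbors
    num_classes = noisy_labels.max() + 1
    votes = np.array([np.bincount(row, minlength=num_classes)
                      for row in neighbor_labels])
    majority = votes.argmax(axis=1)
    return majority != noisy_labels              # True -> suspected corrupted label

def detect_corrupted(feature_extractor, dataset, noisy_labels, M=21, k=10):
    """Aggregate M detection rounds (e.g., over random augmentations) by majority vote."""
    flags = np.zeros(len(noisy_labels), dtype=int)
    for _ in range(M):
        feats = feature_extractor(dataset)       # assumed to return an (N, d) feature array
        flags += knn_vote_round(feats, noisy_labels, k=k)
    return flags > M // 2                        # flagged in a majority of the M rounds
```

Using an odd M keeps the final majority vote over rounds unambiguous, which matches the paper's stated rationale for choosing M = 21 ("an odd number for better tie-breaking").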