Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels

Authors: Songhua Wu, Xiaobo Xia, Tongliang Liu, Bo Han, Mingming Gong, Nannan Wang, Haifeng Liu, Gang Niu

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Its effectiveness is verified by extensive experiments.
Researcher Affiliation | Collaboration | (1) Trustworthy Machine Learning Lab, School of Computer Science, The University of Sydney; (2) Department of Computer Science, Hong Kong Baptist University; (3) School of Mathematics and Statistics, The University of Melbourne; (4) ISN State Key Laboratory, School of Telecommunications Engineering, Xidian University; (5) Brain-Inspired Technology Co., Ltd.; (6) RIKEN Center for Advanced Intelligence Project.
Pseudocode | Yes | Algorithm 1 (Class2Simi). Input: training data with noisy class labels; validation data with noisy class labels. Stage 1, learn T̂_s: (1) learn g(X) = P̂(Ȳ|X) on the training data with noisy class labels, and save the model for Stage 2; (2) estimate T̂_c following the optimization method in (Patrini et al., 2017); (3) transform T̂_c to T̂_s. Stage 2, learn the classifier f(X) = P̂(Y|X): (4) load the model saved in Stage 1, and train the whole pipeline shown in Figure 2. Output: classifier f. (A hedged code sketch of the T̂_c-to-T̂_s transform follows the table.)
Open Source Code | No | The paper does not provide an explicit statement or a link for open-source code for the described methodology.
Open Datasets | Yes | We employ three widely used image datasets, i.e., MNIST (LeCun, 1998), CIFAR-10, and CIFAR-100 (Krizhevsky et al., 2009), one text dataset, News20, and one real-world noisy dataset, Clothing1M (Xiao et al., 2015).
Dataset Splits | Yes | For all the datasets, we leave out 10% of the training data as a validation set, which is used for model selection.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or the types of computing resources used for the experiments.
Software Dependencies | No | The paper mentions models and tools such as GloVe but does not provide specific software dependencies with version numbers (e.g., library versions for PyTorch, TensorFlow, or scikit-learn).
Experiment Setup | No | The paper states "Further details for the experiments are provided in Appendix F.1.", but these details are not present in the provided text. The main text describes the model architectures and noise-generation methods but lacks specific training hyperparameters such as learning rates or batch sizes.
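
To make Stage 1 of Algorithm 1 concrete, below is a minimal NumPy sketch of how a class-label transition matrix T̂_c could be turned into the 2x2 similarity-label transition matrix T̂_s, together with the forward loss correction of Patrini et al. (2017) applied to pairwise similarity predictions. It assumes the two noisy labels in a pair are corrupted independently and, for simplicity, that the clean class prior is uniform; the function names and the uniform-prior simplification are illustrative assumptions, not the paper's exact estimator.

import numpy as np

def class_to_simi_transition(T_c):
    # T_c[i, j] = P(ybar = j | y = i), the class-label transition matrix.
    # Assumption: the two instances in a pair are corrupted independently
    # and the clean class prior is uniform (illustrative simplification).
    c = T_c.shape[0]
    # agree[i, j] = P(ybar_1 = ybar_2 | y_1 = i, y_2 = j).
    agree = T_c @ T_c.T
    # P(sbar = 1 | s = 1): average over same-class pairs (the diagonal).
    p1_given_s1 = np.trace(agree) / c
    # P(sbar = 1 | s = 0): average over different-class pairs (off-diagonal).
    p1_given_s0 = (agree.sum() - np.trace(agree)) / (c * (c - 1))
    # Rows index the clean similarity s in {1, 0}; columns index the noisy
    # similarity sbar in {1, 0}.
    return np.array([[p1_given_s1, 1.0 - p1_given_s1],
                     [p1_given_s0, 1.0 - p1_given_s0]])

def forward_corrected_nll(p_simi, sbar, T_s):
    # Forward correction: map the predicted clean-similarity posterior
    # through T_s, then take the usual negative log-likelihood against the
    # noisy similarity labels. p_simi is (n, 2) with columns [P(s=1), P(s=0)];
    # sbar is (n,) with index 0 meaning "similar", matching T_s's ordering.
    p_noisy = p_simi @ T_s
    picked = p_noisy[np.arange(len(sbar)), sbar]
    return float(-np.log(np.clip(picked, 1e-12, None)).mean())

# Toy usage: 3 classes under symmetric noise (0.8 on the diagonal).
T_c = np.full((3, 3), 0.1) + 0.7 * np.eye(3)
T_s = class_to_simi_transition(T_c)       # [[0.66, 0.34], [0.17, 0.83]]
p_simi = np.array([[0.9, 0.1], [0.2, 0.8]])
sbar = np.array([0, 1])                   # one "similar", one "dissimilar" pair
print(np.round(T_s, 3), round(forward_corrected_nll(p_simi, sbar, T_s), 3))

In this sketch the transform itself is closed-form, which matches the spirit of the algorithm's step 3: once T̂_c is estimated, no further optimization is needed to obtain T̂_s.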