Subclass-Dominant Label Noise: A Counterexample for the Success of Early Stopping

Authors: Yingbin Bai, Zhongyi Han, Erkun Yang, Jun Yu, Bo Han, Dadong Wang, Tongliang Liu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate that Noise Cluster outperforms state-of-the-art baselines on both synthetic and real-world datasets, highlighting the importance of addressing SDN in learning with noisy labels. The code is available at https://github.com/tmllab/2023_NeurIPS_SDN.
Researcher Affiliation | Collaboration | Yingbin Bai (Sydney AI Centre, University of Sydney); Zhongyi Han (Mohamed bin Zayed University of Artificial Intelligence); Erkun Yang (Xidian University); Jun Yu (University of Science and Technology of China); Bo Han (Hong Kong Baptist University); Dadong Wang (CSIRO); Tongliang Liu (Sydney AI Centre, University of Sydney)
Pseudocode | Yes | Algorithm 1: Noise Cluster
  Input: network f_θ; final layer f_ξ; noisy training dataset D̃(X, Ỹ); number of long-training epochs N; class number C; DBSCAN Eps and MinPts.
  for i = 1, ..., N do
      train f_θ and f_ξ on D̃(X, Ỹ)    // standard training
  Ẑ ← t-SNE(f_θ(X))
  for c = 1 to C do
      U^c_K ← DBSCAN(Ẑ^c, Eps, MinPts)    // identify SDN
      for each U^c_k in U^c_K do
          if U^c_k is not the largest cluster then
              compute the set distance with Eq. (1)
              update Ỹ in U^c_k with Eq. (2)
          D_l ← D_l ∪ U^c_k
  Continually train f_θ and f_ξ on D_l for the rest of the epochs.
Open Source Code | Yes | The code is available at https://github.com/tmllab/2023_NeurIPS_SDN.
Open Datasets | Yes | To facilitate research on SDN, we introduce CIFAR20-SDN, a representative SDN dataset built from CIFAR-100, which provides 20 class labels and 100 subclass labels.
Dataset Splits | Yes | In experiments without SSL, we reserve 10% of the training data as the validation set, while we utilize the entire training data for experiments with SSL.
Hardware Specification | Yes | All methods run on a four-core CPU and a single NVIDIA V100 GPU.
Software Dependencies | No | No specific software dependencies with version numbers are provided.
Experiment Setup | Yes | For CIFAR20-SDN, we employ ResNet-34 [19] for experiments without SSL and PreActResNet-18 [20] for experiments with SSL. During optimization, we train the model for 300 epochs, using a learning rate of 2×10⁻², a single cycle of cosine annealing [37], a momentum of 0.9, and a weight decay of 5×10⁻⁴. We utilize a batch size of 128 and a stopping epoch of 80, with a ClosePoint value of 20. For DBSCAN hyperparameters, Eps and MinPts are set to 0.02 and 100, respectively.
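
The pseudocode row above quotes Algorithm 1 (Noise Cluster). Below is a minimal, non-authoritative Python sketch of its relabeling loop, assuming scikit-learn's TSNE and DBSCAN; `noise_cluster_relabel` and its nearest-centroid relabeling rule are illustrative stand-ins for the paper's Eq. (1)/(2) set distance, not the authors' released implementation.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import DBSCAN


def noise_cluster_relabel(features, noisy_labels, num_classes, eps=0.02, min_pts=100):
    """Sketch of the Noise Cluster relabeling step applied after long training."""
    # Two-dimensional t-SNE embedding of the long-trained features (Z-hat in Algorithm 1).
    z = TSNE(n_components=2).fit_transform(features)
    new_labels = noisy_labels.copy()

    # Per-class centroids in the embedding, used by the hypothetical
    # nearest-centroid stand-in for the paper's set distance (Eqs. (1)-(2)).
    centroids = np.stack([
        z[noisy_labels == c].mean(axis=0) if np.any(noisy_labels == c) else np.zeros(2)
        for c in range(num_classes)
    ])

    for c in range(num_classes):
        idx = np.where(noisy_labels == c)[0]
        if idx.size == 0:
            continue
        # Cluster the class-c embedding with DBSCAN (U^c_K in Algorithm 1).
        cluster_ids = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(z[idx])
        valid = cluster_ids[cluster_ids >= 0]          # -1 marks DBSCAN noise points
        if valid.size == 0:
            continue
        largest = np.bincount(valid).argmax()
        for k in np.unique(valid):
            if k == largest:
                continue                               # the largest cluster keeps its label
            members = idx[cluster_ids == k]
            # Hypothetical relabeling rule: assign the class whose centroid is
            # closest to the cluster mean (the paper uses a set distance instead).
            dists = np.linalg.norm(centroids - z[members].mean(axis=0), axis=1)
            new_labels[members] = int(dists.argmin())
    return new_labels
```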
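
The open-datasets row introduces CIFAR20-SDN, built from CIFAR-100's 20 coarse and 100 fine labels. The sketch below shows one plausible way to corrupt coarse labels at the subclass level with torchvision; the subclass selection, noise rate, and wrong-label choice here are illustrative assumptions, not the released dataset's exact recipe (that lives in the linked repository).

```python
import pickle
import numpy as np
from torchvision.datasets import CIFAR100


def build_cifar20_sdn(root="./data", noise_rate=0.2, seed=0):
    """Illustrative SDN-style corruption: flip whole CIFAR-100 subclasses to a wrong coarse label."""
    base = CIFAR100(root=root, train=True, download=True)
    # torchvision exposes only fine labels; coarse labels live in the raw pickle.
    with open(f"{root}/cifar-100-python/train", "rb") as f:
        raw = pickle.load(f, encoding="latin1")
    fine = np.array(raw["fine_labels"])
    coarse = np.array(raw["coarse_labels"])

    rng = np.random.default_rng(seed)
    num_noisy_subclasses = int(round(noise_rate * 100))
    noisy_subclasses = rng.choice(100, size=num_noisy_subclasses, replace=False)

    noisy_coarse = coarse.copy()
    for sub in noisy_subclasses:
        true_c = coarse[fine == sub][0]
        # Relabel every example of this subclass with one wrong coarse label,
        # so the noise dominates the whole subclass (the SDN setting).
        wrong = rng.choice([c for c in range(20) if c != true_c])
        noisy_coarse[fine == sub] = wrong
    return base.data, noisy_coarse, coarse
```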
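
The dataset-splits row reserves 10% of the training data as a validation set in the non-SSL experiments. A self-contained sketch of such a hold-out with scikit-learn follows; the placeholder labels, random seed, and stratification are assumptions, not details given in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder for the (N,) array of noisy 20-way labels (see the dataset sketch above).
noisy_coarse = np.random.default_rng(0).integers(0, 20, size=50_000)

# Hold out 10% of the noisy training set as the validation split (non-SSL setting).
indices = np.arange(len(noisy_coarse))
train_idx, val_idx = train_test_split(
    indices, test_size=0.10, random_state=0, stratify=noisy_coarse)
```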
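
The experiment-setup row pins down the optimization hyperparameters. The sketch below wires those values into a PyTorch SGD optimizer with a single-cycle cosine-annealing schedule, using a torchvision ResNet-34 as a stand-in backbone; the epoch body, data loading, and the SSL variant are omitted.

```python
import torch
from torchvision.models import resnet34

# Hyperparameters quoted in the experiment-setup row.
EPOCHS, BATCH_SIZE, STOP_EPOCH = 300, 128, 80
LR, MOMENTUM, WEIGHT_DECAY = 2e-2, 0.9, 5e-4

model = resnet34(num_classes=20)  # CIFAR20-SDN uses 20 (coarse) class labels
optimizer = torch.optim.SGD(model.parameters(), lr=LR,
                            momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)
# A single cycle of cosine annealing over the full 300 epochs.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    # ... one training epoch over the (possibly relabeled) noisy dataset goes here ...
    if epoch == STOP_EPOCH:
        pass  # the Noise Cluster relabeling step would run here before training continues
    scheduler.step()
```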