Robust Data Pruning under Label Noise via Maximizing Re-labeling Accuracy
Authors: Dongmin Park, Seola Choi, Doyoung Kim, Hwanjun Song, Jae-Gil Lee
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four real noisy datasets, CIFAR-10N, CIFAR-100N, WebVision, and Clothing-1M, and one synthetic noisy dataset on ImageNet-1K show that Prune4ReL consistently outperforms the eight data pruning baselines by up to 9.1%. Moreover, Prune4ReL with Re-labeling models significantly outperforms the data pruning baselines with a standard model by up to 21.6%, which reaffirms the necessity of data pruning with re-labeling. (Abstract) |
| Researcher Affiliation | Collaboration | Dongmin Park1, Seola Choi1, Doyoung Kim1, Hwanjun Song2, Jae-Gil Lee1 1 KAIST, 2 AWS AI Labs |
| Pseudocode | Yes | Algorithm 1 Greedy Neighborhood Confidence (Page 5) |
| Open Source Code | Yes | The code is available at https://github.com/kaist-dmlab/Prune4Rel. (Section 4.1) |
| Open Datasets | Yes | Datasets. We first perform the data pruning task on four real noisy datasets, CIFAR-10N, CIFAR-100N, WebVision, and Clothing-1M. CIFAR-10N and CIFAR-100N [7]... WebVision [8]... Clothing-1M [9]... ImageNet-1K [48]... |
| Dataset Splits | No | The paper does not explicitly specify train/validation/test splits. It mentions 'selection ratios' for creating subsets and reports 'test accuracy', but does not describe an explicit validation split used during model training. |
| Hardware Specification | Yes | All methods are implemented with PyTorch 1.8.0 and executed on NVIDIA RTX 3080 GPUs. (Section 4.1) |
| Software Dependencies | Yes | All methods are implemented with PyTorch 1.8.0 and executed on NVIDIA RTX 3080 GPUs. (Section 4.1) |
| Experiment Setup | Yes | The hyperparameters for DivideMix and SOP+ are favorably configured following the original papers. Following the prior Re-labeling work [13, 33], for CIFAR-10N and CIFAR-100N, PreAct ResNet-18 [51] is trained for 300 epochs using SGD with a momentum of 0.9, a weight decay of 0.0005, and a batch size of 128. The initial learning rate is 0.02, and it is decayed with a cosine annealing scheduler. For WebVision, InceptionResNetV2 [52] is trained for 100 epochs with a batch size of 32. For Clothing-1M, we use ResNet-50 [53] pre-trained on ImageNet and fine-tune it for 10 epochs with a batch size of 32. The initial learning rates of WebVision and Clothing-1M are 0.02 and 0.002, which are dropped by a factor of 10 at the halfway point of the training epochs. For ImageNet-N, ResNet-50 [53] is trained for 50 epochs with a batch size of 64 and an initial learning rate of 0.02 decayed with a cosine annealing scheduler. (Section 4.1) |
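The quoted CIFAR-10N/CIFAR-100N setup (SGD, momentum 0.9, weight decay 0.0005, initial learning rate 0.02, cosine annealing over 300 epochs, batch size 128) can be sketched in PyTorch as below. This is a minimal illustration of the optimizer/scheduler configuration only; the placeholder `nn.Linear` model and the empty per-epoch loop are stand-ins, not the paper's PreAct ResNet-18 training code.

```python
# Sketch of the optimizer/scheduler configuration quoted from Section 4.1.
# Assumption: the model here is a trivial placeholder; the paper trains
# PreAct ResNet-18 on CIFAR-10N/CIFAR-100N with batch size 128.
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

EPOCHS = 300
model = nn.Linear(3 * 32 * 32, 10)  # placeholder, NOT PreAct ResNet-18

# SGD with momentum 0.9, weight decay 0.0005, initial lr 0.02 (as quoted)
optimizer = SGD(model.parameters(), lr=0.02, momentum=0.9, weight_decay=5e-4)

# Cosine annealing decay over the full training run
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    # ... one training epoch over mini-batches of size 128 would go here ...
    scheduler.step()  # one scheduler step per epoch
```

After `T_max` epochs the cosine schedule has annealed the learning rate from 0.02 down to (near) zero, matching the quoted "decayed with a cosine annealing scheduler".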