Noisy Label Learning with Instance-Dependent Outliers: Identifiability via Crowd Wisdom

Authors: Tri Nguyen, Shahana Ibrahim, Xiao Fu

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that our learning scheme substantially improves outlier detection and the classifier's testing accuracy. We evaluate the proposed method over a number of real datasets that are annotated by machine and human annotators under various conditions and observed nontrivial improvements in testing accuracy." (Section 5, Experiments)
Researcher Affiliation | Academia | Tri Nguyen, School of EECS, Oregon State University, Corvallis, Oregon, USA (nguyetr9@oregonstate.edu); Shahana Ibrahim, Department of ECE, University of Central Florida, Orlando, Florida, USA (shahana.ibrahim@ucf.edu); Xiao Fu, School of EECS, Oregon State University, Corvallis, Oregon, USA (xiao.fu@oregonstate.edu)
Pseudocode | No | The paper describes its algorithms and implementation in prose but does not include a clearly labeled "Pseudocode" or "Algorithm" block.
Open Source Code | Yes | "We release the code and our acquired noisy annotations at https://github.com/ductri/COINNet."
Open Datasets | Yes | "Dataset. We consider the CIFAR-10 [62] and the STL-10 [63] datasets; see Appendix G for details. CIFAR-10N. The first dataset that we use is the CIFAR-10N dataset [66]. LabelMe. We also test the algorithms over the LabelMe dataset [67, 68]. ImageNet-15N. In addition to existing datasets, we also acquire noisy annotations by asking AMT workers to annotate some images from ImageNet. ... We release the code and our acquired noisy annotations at https://github.com/ductri/COINNet." (A hedged loading sketch follows the table.)
Dataset Splits | Yes | "The CIFAR-10 dataset consists of 60,000 labeled color images... The images are split into training and testing sets with sizes 50,000 and 10,000, respectively. ... We randomly split the training set into 47,500 and 2,500 to use as the train and validation sets for all methods." For LabelMe: "The validation set comprises 500 images, while the remaining 1,188 images are reserved for testing." (A split sketch follows the table.)
Hardware Specification | Yes | "All runs have been conducted using either Nvidia A40 or Nvidia DGX H100 GPUs."
Software Dependencies | No | The paper mentions using Adam as an optimizer and ResNet-34, ResNet-9, VGG-16, and CLIP as architectures/models, but does not specify version numbers for any software libraries or frameworks (e.g., PyTorch, TensorFlow, scikit-learn) required for replication.
Experiment Setup | Yes | "For our proposed approach COINNet, we fix ζ = 10⁻¹⁰, p = 0.4, and µ₁ = µ₂ = 0.01. Adam [65] is used as the optimizer with a weight decay of 10⁻⁴, a learning rate of 0.01, and a batch size of 512. We train with a batch size of 512 for 200 epochs, using the Adam optimizer with a learning rate of 0.01 and the OneCycleLR learning rate scheduler [86]." (A configuration sketch follows the table.)
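
The Open Datasets row references CIFAR-10N, the human re-annotation of CIFAR-10. The following minimal sketch shows one common way to pair such noisy labels with the torchvision CIFAR-10 images; the file name and key names follow the public CIFAR-N release and should be treated as assumptions to verify against that dataset's own instructions.

```python
# Hedged sketch: pairing CIFAR-10 images with CIFAR-10N human annotations.
# File and key names follow the public CIFAR-N release; verify before use.
import torch
from torchvision import datasets, transforms

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())

noise = torch.load("CIFAR-10_human.pt")     # annotation file distributed with CIFAR-N
noisy_labels = noise["random_label1"]       # one of three independent human labelings
assert len(noisy_labels) == len(train_set)  # one noisy label per training image
```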
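
The CIFAR-10 partition quoted in the Dataset Splits row (50,000/10,000 train/test, with 47,500/2,500 held out for train/validation) can be reproduced with a short PyTorch sketch. The seed and transform below are illustrative assumptions, not values taken from the paper or its released code.

```python
# Hedged sketch of the reported CIFAR-10 partition.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # placeholder; the paper's augmentations may differ

# torchvision's CIFAR-10 already ships the 50,000/10,000 train/test split.
train_full = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# Randomly hold out 2,500 validation images from the 50,000 training images.
train_set, val_set = random_split(
    train_full, [47_500, 2_500],
    generator=torch.Generator().manual_seed(0),  # seed is an assumption
)
print(len(train_set), len(val_set), len(test_set))  # 47500 2500 10000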
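
The hyperparameters quoted in the Experiment Setup row map onto roughly the following PyTorch optimizer/scheduler configuration. This is a minimal sketch under several assumptions: the model choice and plain cross-entropy loss are placeholders, and the quoted learning rate is assumed to be the OneCycleLR peak. The COINNet-specific terms (ζ, p, µ₁, µ₂) belong to the method's own objective and are not reproduced here; the released code at https://github.com/ductri/COINNet is the authoritative reference.

```python
# Hedged sketch of the quoted setup: Adam, lr 0.01, weight decay 1e-4,
# batch size 512, 200 epochs, OneCycleLR. Model and loss are placeholders.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

model = models.resnet34(num_classes=10)  # one of the architectures the paper mentions
train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=512, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=0.01,                       # assuming the quoted lr is the one-cycle peak
    epochs=200,
    steps_per_epoch=len(train_loader),
)

for epoch in range(200):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(images), labels)
        loss.backward()
        optimizer.step()
        scheduler.step()               # OneCycleLR is stepped once per batch
```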