Vision-Language Models are Strong Noisy Label Detectors
Authors: Tong Wei, Hao-Tian Li, Chun-Shu Li, Jiang-Xin Shi, Yu-Feng Li, Min-Ling Zhang
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on seven synthetic and real-world noisy datasets validate the effectiveness of DEFT in both noisy label detection and image classification tasks. |
| Researcher Affiliation | Academia | Tong Wei^{1,2,3}, Hao-Tian Li^{1,2}, Chun-Shu Li^{1,2}, Jiang-Xin Shi^{3,4}, Yu-Feng Li^{3,4}, Min-Ling Zhang^{1,2}; 1 School of Computer Science and Engineering, Southeast University, Nanjing, China; 2 Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, China; 3 National Key Laboratory for Novel Software Technology, Nanjing University, China; 4 School of Artificial Intelligence, Nanjing University, China; {weit,liht}@seu.edu.cn |
| Pseudocode | Yes | Algorithm 1: The Proposed DEFT Framework |
| Open Source Code | Yes | Our source code is available at https://github.com/HotanLee/DeFT. |
| Open Datasets | Yes | We conduct experimental analyses on widely-used CIFAR-100 [23] and Tiny-ImageNet [42], as well as two fine-grained datasets Stanford-Cars [22] and CUB-200-2011 [34]... We further examine the performance of DEFT on three real-world noisy label datasets: 1) CIFAR-100N [39]... 2) Clothing1M [47]... 3) WebVision [26]. |
| Dataset Splits | No | The paper reports 'training' and 'test accuracy' but does not define or use a separate validation split for hyperparameter tuning or early stopping. |
| Hardware Specification | Yes | All experiments are conducted on a single NVIDIA GeForce RTX 3090. |
| Software Dependencies | No | The paper mentions models and optimizers (e.g., CLIP, ViT-B/16, SGD, AdamW) but does not provide specific version numbers for software libraries or dependencies (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | We use the SGD optimizer with a momentum of 0.9, a weight decay of 5e-4, and a batch size of 64. We run 10 epochs for both the noisy label detection phase and the model adaptation phase with learning rates 3e-2 and 5e-4, respectively. In the noisy label detection phase, we employ VPT [18] and CoOp [53] to adapt the visual encoder and textual encoder respectively, and perform model warm-up for 1 epoch on all datasets. |
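
The setup row above fully specifies the SGD hyperparameters. As a minimal sketch of what those numbers mean in practice, the snippet below implements one standard SGD step with momentum and L2 weight decay using the paper's reported values; the function and variable names are illustrative, not taken from the DeFT codebase.

```python
# Hyperparameters reported in the paper's experiment setup.
LR_DETECTION = 3e-2    # learning rate, noisy label detection phase
LR_ADAPTATION = 5e-4   # learning rate, model adaptation phase
MOMENTUM = 0.9
WEIGHT_DECAY = 5e-4
BATCH_SIZE = 64
EPOCHS = 10            # per phase

def sgd_momentum_step(param, grad, velocity,
                      lr=LR_DETECTION, momentum=MOMENTUM,
                      weight_decay=WEIGHT_DECAY):
    """One SGD update with momentum and L2 weight decay on a scalar parameter,
    following the conventional formulation: g <- g + wd*p; v <- mu*v + g; p <- p - lr*v."""
    grad = grad + weight_decay * param   # L2 weight decay folded into the gradient
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity

# Example: one step from param=1.0 with gradient 0.5 and zero initial velocity.
p, v = sgd_momentum_step(param=1.0, grad=0.5, velocity=0.0)
```

With these values, the effective gradient is 0.5 + 5e-4 * 1.0 = 0.5005, so the parameter moves to 1.0 - 0.03 * 0.5005 = 0.984985.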