How does Disagreement Help Generalization against Label Corruption?
Authors: Xingrui Yu, Bo Han, Jiangchao Yao, Gang Niu, Ivor Tsang, Masashi Sugiyama
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on benchmark datasets demonstrate that Co-teaching+ is much superior to many state-of-the-art methods in the robustness of trained models. We conduct experiments on both simulated and real-world noisy datasets, including noisy MNIST, CIFAR-10, CIFAR-100, NEWS, T-ImageNet and three Open-sets (Wang et al., 2018). |
| Researcher Affiliation | Collaboration | 1. CAI, University of Technology Sydney; 2. RIKEN-AIP; 3. Alibaba DAMO Academy; 4. University of Tokyo. |
| Pseudocode | Yes | Algorithm 1: Co-teaching+. Step 4: disagreement-update; Steps 5-8: cross-update (a sketch of these steps follows the table). |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | Datasets. First, we verify the efficacy of our approach on four benchmark datasets (Table 2), including three vision datasets (i.e., MNIST, CIFAR-10, and CIFAR-100) and one text dataset (i.e., NEWS). Then, we verify our approach on a larger and harder dataset called Tiny ImageNet (abbreviated as T-ImageNet). These datasets are popularly used for the evaluation of learning with noisy labels in the literature (Reed et al., 2015; Goldberger & Ben-Reuven, 2017; Kiryo et al., 2017). |
| Dataset Splits | No | The paper uses standard benchmark datasets but does not explicitly detail the validation dataset splits used for its experiments, only mentioning training and test sets and that 'clean validation data is not available' for a baseline method. |
| Hardware Specification | Yes | We implement all methods with default parameters in PyTorch, and conduct all the experiments on an NVIDIA Titan Xp GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number for it or any other software dependencies. |
| Experiment Setup | Yes | Optimizer. The Adam optimizer (momentum = 0.9) is used with an initial learning rate of 0.001; the batch size is set to 128 and training runs for 200 epochs. The learning rate is linearly decayed to zero from epoch 80 to epoch 200 (a schedule sketch follows the table). |
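The disagreement-update and cross-update steps cited in the Pseudocode row (Algorithm 1) can be illustrated with a short PyTorch-style sketch. This is a minimal illustration under assumed names (`model_f`, `model_g`, `remember_rate`, `co_teaching_plus_step`), not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def co_teaching_plus_step(model_f, model_g, opt_f, opt_g, x, y, remember_rate):
    """Hedged sketch of one Co-teaching+ mini-batch update (Algorithm 1, Steps 4-8)."""
    with torch.no_grad():
        # Step 4 (disagreement-update): keep only examples on which the two
        # networks predict different labels.
        disagree = model_f(x).argmax(dim=1) != model_g(x).argmax(dim=1)
        if disagree.sum() == 0:
            return  # no disagreement in this batch; skip it (simplification)
        x_d, y_d = x[disagree], y[disagree]
        # Per-example losses on the disagreement set, used only for ranking.
        loss_f = F.cross_entropy(model_f(x_d), y_d, reduction='none')
        loss_g = F.cross_entropy(model_g(x_d), y_d, reduction='none')

    # Steps 5-8 (cross-update): each network selects its small-loss examples,
    # and the peer network is updated on that selection.
    k = max(1, int(remember_rate * len(y_d)))
    idx_f = torch.argsort(loss_f)[:k]  # chosen by f, used to update g
    idx_g = torch.argsort(loss_g)[:k]  # chosen by g, used to update f

    opt_f.zero_grad()
    F.cross_entropy(model_f(x_d[idx_g]), y_d[idx_g]).backward()
    opt_f.step()

    opt_g.zero_grad()
    F.cross_entropy(model_g(x_d[idx_f]), y_d[idx_f]).backward()
    opt_g.step()
```

Here `remember_rate` stands in for the paper's R(T), the fraction of small-loss (presumed clean) examples kept at epoch T; its schedule follows the paper and is omitted from this sketch.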
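Likewise, the reported experiment setup (Adam with initial learning rate 0.001, batch size 128, 200 epochs, linear decay to zero from epoch 80) can be expressed as a small PyTorch sketch. The `model` variable and the per-epoch training loop body are assumed; the quoted "momentum = 0.9" is read as Adam's first-moment coefficient beta1 = 0.9, which is PyTorch's default.

```python
import torch

n_epochs, decay_start = 200, 80
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))

def lr_lambda(epoch):
    # Multiplicative factor on the initial lr: 1.0 up to epoch 80,
    # then linear decay that reaches 0 at epoch 200.
    if epoch < decay_start:
        return 1.0
    return max(0.0, (n_epochs - epoch) / (n_epochs - decay_start))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda)

for epoch in range(n_epochs):
    # ... iterate over mini-batches of size 128, e.g. calling co_teaching_plus_step ...
    scheduler.step()
```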