Dimensionality-Driven Learning with Noisy Labels
Authors: Xingjun Ma, Yisen Wang, Michael E. Houle, Shuo Zhou, Sarah Erfani, Shutao Xia, Sudanthi Wijewickrema, James Bailey
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate that our approach is highly tolerant to significant proportions of noisy labels, and can effectively learn low-dimensional local subspaces that capture the data distribution. We empirically demonstrate on MNIST, SVHN, CIFAR-10 and CIFAR-100 datasets that our Dimensionality-Driven Learning strategy can effectively learn (1) low-dimensional representation subspaces that capture the underlying data distribution, (2) simpler hypotheses, and (3) high-quality deep representations. We evaluate our proposed D2L learning strategy, comparing the performance of our model with state-of-the-art baselines for noisy label learning. We report the mean test accuracy and standard deviation over 5 repetitions of the experiments in Table 1. |
| Researcher Affiliation | Academia | 1The University of Melbourne, Melbourne, Australia 2Tsinghua University, Beijing, China 3National Institute of Informatics, Tokyo, Japan. |
| Pseudocode | Yes | Algorithm 1 Dimensionality-Driven Learning (D2L) (a hedged sketch of the label-adaptation step it describes appears after this table) |
| Open Source Code | Yes | The D2L code is available at https://github.com/xingjunm/dimensionality-driven-learning. |
| Open Datasets | Yes | MNIST (an image data set with 10 categories of handwritten digits (LeCun et al., 1998)); CIFAR-10 (a natural image data set with 10 categories (Krizhevsky & Hinton, 2009)); SVHN (Netzer et al., 2011); CIFAR-100 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | The paper does not explicitly state any validation dataset splits or sample counts for validation sets. It mentions total epochs and learning rate schedules but no specific validation strategy or data partitioning for validation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions training networks using 'SGD' but does not specify any software versions for libraries (e.g., PyTorch, TensorFlow) or operating systems used for the experiments. |
| Experiment Setup | Yes | All networks were trained using SGD with momentum 0.9, weight decay 10⁻⁴ and an initial learning rate of 0.1. The learning rate was divided by 10 after epochs 40 and 80 (T = 120 epochs in total). Simple data augmentations (width/height shift and horizontal flip) were applied. For our proposed D2L, we set k = 20 for LID estimation, and used the average LID score over m = 10 random batches of training samples as the overall dimensionality of the representation subspaces. To identify the turning point between the two stages of learning, we employ an epoch window of size w ∈ [1, T − 1]... (a configuration and LID-estimation sketch follows this table) |
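
The hyperparameters quoted in the Experiment Setup row translate fairly directly into code. The sketch below is a minimal illustration, assuming PyTorch (the excerpt does not tie the setup to a framework); `feature_fn` is a hypothetical hook that returns the deep representation used for LID scoring, while the optimizer, learning-rate schedule, k = 20 and m = 10 mirror the quoted values. The LID estimator is the standard maximum-likelihood estimate over k-nearest-neighbour distances.

```python
import torch

# Values quoted in the Experiment Setup row.
LR, MOMENTUM, WEIGHT_DECAY = 0.1, 0.9, 1e-4
MILESTONES, TOTAL_EPOCHS = [40, 80], 120
K_NEIGHBOURS, NUM_BATCHES = 20, 10  # k = 20, m = 10

def make_optimizer(model):
    """SGD with momentum 0.9, weight decay 1e-4, lr 0.1 divided by 10 at epochs 40 and 80."""
    optimizer = torch.optim.SGD(model.parameters(), lr=LR,
                                momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                     milestones=MILESTONES, gamma=0.1)
    return optimizer, scheduler

def lid_mle(batch_reps, k=K_NEIGHBOURS):
    """Maximum-likelihood LID estimate for a batch of representations (n x d),
    averaged over the batch; r_i is the distance to the i-th nearest neighbour."""
    dists = torch.cdist(batch_reps, batch_reps)           # pairwise Euclidean distances
    knn, _ = torch.topk(dists, k + 1, dim=1, largest=False)
    knn = knn[:, 1:].clamp_min(1e-12)                     # drop self-distance, avoid log(0)
    r_max = knn[:, -1:]                                   # distance to the k-th neighbour
    lid = -1.0 / torch.log(knn / r_max).mean(dim=1)       # -(1/k * sum log(r_i / r_k))^-1
    return lid.mean()

def subspace_dimensionality(model, loader, feature_fn, m=NUM_BATCHES):
    """Average LID over m random training batches, used as the overall
    dimensionality score of the current representation subspace."""
    scores = []
    with torch.no_grad():
        for i, (x, _) in enumerate(loader):
            if i >= m:
                break
            scores.append(lid_mle(feature_fn(model, x)))
    return torch.stack(scores).mean()
```

In a training loop, `scheduler.step()` would be called once per epoch, and `subspace_dimensionality` once per epoch to build the LID history that D2L monitors.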
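
The Pseudocode row refers to Algorithm 1 (D2L), which monitors that LID history and, once a turning point is detected, switches to a bootstrapping-style target that interpolates the given (possibly noisy) label with the model's current prediction. The snippet below only illustrates that structure; the turning-point test and the weight schedule are assumed placeholders, not the paper's exact formulas, which are given in Algorithm 1 of the paper.

```python
import math
from statistics import mean, stdev

import torch
import torch.nn.functional as F

def turning_point_reached(lid_history, w):
    """Assumed criterion: flag the turning point once the newest LID score rises
    clearly above the previous window of w epochs. The paper defines its own
    window-based test over w in [1, T-1]; this check is only a stand-in."""
    if w < 2 or len(lid_history) <= w:
        return False
    window = lid_history[-w - 1:-1]
    return lid_history[-1] > mean(window) + 2 * stdev(window)

def adaptation_weight(lid_now, lid_min):
    """Assumed schedule: the weight on the given label decays as the current LID
    grows relative to the minimum LID observed so far."""
    return math.exp(-(lid_now / lid_min))

def d2l_style_target(y_onehot, logits, alpha):
    """Bootstrapping-style target: interpolate the one-hot label with the model's
    current hard prediction, then train with cross-entropy against this target."""
    probs = F.softmax(logits, dim=1)
    pred_onehot = F.one_hot(probs.argmax(dim=1), probs.size(1)).float()
    return alpha * y_onehot + (1.0 - alpha) * pred_onehot
```

Roughly, before the turning point the weight stays at 1 (ordinary cross-entropy on the given labels); after it, the adapted target is used for the remaining epochs, matching the two-stage (compression then expansion) picture the paper describes.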