Label Noise: Ignorance Is Bliss
Authors: Yilun Zhu, Jianxin Zhang, Aditya Gangrade, Clayton Scott
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We establish a new theoretical framework for learning under multi-class, instance-dependent label noise. ... Finally, we translate this theoretical insight into practice: by using NI-ERM to fit a linear classifier on top of a self-supervised feature extractor, we achieve state-of-the-art performance on the CIFAR-N data challenge. ... We conducted experiments on the CIFAR image data under two scenarios: synthetic label flipping (symmetric noise) and realistic human label errors [Wei et al., 2022], as shown in Figure 3. (A hedged sketch of the NI-ERM recipe appears after this table.) |
| Researcher Affiliation | Academia | Yilun Zhu (EECS, University of Michigan, allanzhu@umich.edu); Jianxin Zhang (EECS, University of Michigan, jianxinz@umich.edu); Aditya Gangrade (ECE, Boston University, gangrade@bu.edu); Clayton Scott (EECS and Statistics, University of Michigan, clayscot@umich.edu) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at: https://github.com/allan-z/label_noise_ignorance. ... Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: Code is provided, common benchmark datasets were used, instructions are given, and the result is easily reproducible. |
| Open Datasets | Yes | CIFAR-N data challenge. ... We conducted experiments on the CIFAR image data under two scenarios: synthetic label flipping (symmetric noise) and realistic human label errors [Wei et al., 2022], as shown in Figure 3. ... MNIST (http://yann.lecun.com/exdb/mnist/) ... CIFAR-10 (https://www.cs.toronto.edu/~kriz/cifar.html) |
| Dataset Splits | Yes | We prespecify a range of values for ℓ2 regularization ({0.0001, 0.001, 0.01, 0.1, 1, 10, 100}) and number of iterations for the lbfgs optimizer ({10, 20, 50, 100}), then do cross-validation on noisy data to pick the best hyper-parameters. |
| Hardware Specification | Yes | The experiment was conducted on AMD Ryzen 5 3600 CPU. ... The experiments were conducted on a single NVIDIA GTX 1660S GPU. ... The experiments were conducted on a single NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions software components like "Sklearn's logistic regression" and "Pytorch model library" but does not provide specific version numbers for these software dependencies, which are required for full reproducibility. |
| Experiment Setup | Yes | We prespecify a range of values for ℓ2 regularization ({0.0001, 0.001, 0.01, 0.1, 1, 10, 100}) and number of iterations for the lbfgs optimizer ({10, 20, 50, 100}), then do cross-validation on noisy data to pick the best hyper-parameters. (A hedged sketch of this selection procedure appears below the table.) |
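To make the NI-ERM recipe quoted in the Research Type row concrete, here is a minimal sketch: train an ordinary linear classifier directly on noisy labels, ignoring the noise entirely. Everything beyond what the paper states is an assumption: the feature matrix from `make_classification` is a placeholder for precomputed self-supervised features (the paper does not specify the extractor here), and the noise rate `rho = 0.4` is illustrative. Only the symmetric-flipping scheme and the use of scikit-learn's logistic regression with lbfgs follow the quoted experimental description.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def flip_symmetric(y, n_classes, rho, rng):
    """Symmetric label noise: with probability rho, replace a label
    with a uniformly random *different* class."""
    y_noisy = y.copy()
    flip = rng.random(len(y)) < rho
    # A random offset in {1, ..., n_classes - 1} guarantees the new label differs.
    offsets = rng.integers(1, n_classes, size=int(flip.sum()))
    y_noisy[flip] = (y_noisy[flip] + offsets) % n_classes
    return y_noisy

# Placeholder for precomputed features; in the paper these would come from
# a frozen self-supervised feature extractor applied to CIFAR images.
X, y = make_classification(n_samples=5000, n_features=64, n_informative=32,
                           n_classes=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Corrupt only the training labels (rho = 0.4 is an illustrative noise rate).
y_tr_noisy = flip_symmetric(y_tr, n_classes=10, rho=0.4, rng=rng)

# Noise-Ignorant ERM: fit a plain linear classifier on the noisy labels,
# with no noise modeling or correction of any kind.
clf = LogisticRegression(solver="lbfgs", max_iter=100).fit(X_tr, y_tr_noisy)
print("accuracy on clean test labels:", clf.score(X_te, y_te))
```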
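The hyperparameter selection quoted in the Dataset Splits and Experiment Setup rows can likewise be sketched with scikit-learn's `GridSearchCV`. One caveat: the paper reports an ℓ2 regularization grid, while sklearn's `LogisticRegression` exposes the inverse strength `C`; mapping the grid as `C = 1/λ` is an assumption about the authors' convention. The placeholder data again stands in for frozen features paired with noisy labels, on which the cross-validation is run per the quoted setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Placeholder data; in the paper, X holds frozen features and y the noisy labels.
X, y = make_classification(n_samples=2000, n_features=64, n_informative=32,
                           n_classes=10, random_state=0)

lambdas = [0.0001, 0.001, 0.01, 0.1, 1, 10, 100]  # l2 grid quoted from the paper
param_grid = {
    # Assumption: the paper's l2 strengths map onto sklearn's inverse
    # regularization parameter as C = 1/lambda.
    "C": [1.0 / lam for lam in lambdas],
    "max_iter": [10, 20, 50, 100],  # lbfgs iteration budgets from the paper
}

# Cross-validate on the (noisy) training labels, matching the quoted setup.
search = GridSearchCV(LogisticRegression(solver="lbfgs", penalty="l2"),
                      param_grid=param_grid, cv=5, n_jobs=-1)
search.fit(X, y)
print("best hyper-parameters:", search.best_params_)
```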