Learning to Reweight Examples for Robust Deep Learning
Authors: Mengye Ren, Wenyuan Zeng, Bin Yang, Raquel Urtasun
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To test the effectiveness of our reweighting algorithm, we designed both class imbalance and noisy label settings, and a combination of both, on standard MNIST and CIFAR benchmarks for image classification using deep CNNs. |
| Researcher Affiliation | Collaboration | ¹Uber Advanced Technologies Group, Toronto ON, CANADA; ²Department of Computer Science, University of Toronto, Toronto ON, CANADA. |
| Pseudocode | Yes | Algorithm 1 Learning to Reweight Examples using Automatic Differentiation (a sketch follows the table) |
| Open Source Code | No | The paper does not provide a link to open-source code for the methodology described. |
| Open Datasets | Yes | We use the standard MNIST handwritten digit classification dataset and subsample the dataset to generate a class imbalance binary classification task. (...) We conduct experiments on CIFAR-10 and CIFAR-100. |
| Dataset Splits | Yes | We split the balanced validation set of 10 images directly from the training set. (...) For UNIFORMFLIP, we use 1,000 clean images in the validation set; for BACKGROUNDFLIP, we use 10 clean images per label class. (See the split sketch after the table.) |
| Hardware Specification | No | The paper does not specify any hardware used for the experiments (e.g., GPU/CPU models). |
| Software Dependencies | No | The paper mentions 'popular deep learning frameworks such as TensorFlow (Abadi et al., 2016)' but does not specify version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | The network is trained with SGD with a learning rate of 1e-3 and mini-batch size of 100 for a total of 8,000 steps. (...) We train the models with SGD with momentum, at an initial learning rate 0.1 and a momentum 0.9 with mini-batch size 100. For ResNet-32 models, the learning rate decays 0.1 at 40K and 60K steps, for a total of 80K steps. For WRN and early stopped versions of ResNet-32 models, the learning rate decays at 40K and 50K steps, for a total of 60K steps. (An optax sketch of this schedule follows the table.) |
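The Pseudocode row points to the paper's Algorithm 1, which reweights each training example by the clipped, normalized negative gradient of a clean validation loss, taken through one look-ahead SGD step on the weighted training loss. Below is a minimal JAX sketch of one such update, assuming a toy linear `apply_fn` as a stand-in for the paper's CNNs; function names and the learning-rate default are illustrative, not the authors' code.

```python
import jax
import jax.numpy as jnp

def apply_fn(params, x):
    # Hypothetical stand-in for the paper's CNN: a linear classifier.
    return x @ params["w"] + params["b"]

def per_example_loss(params, x, y):
    # Per-example cross-entropy over integer labels.
    logp = jax.nn.log_softmax(apply_fn(params, x))
    return -jnp.take_along_axis(logp, y[:, None], axis=1).squeeze(1)

def reweighted_step(params, train_batch, val_batch, lr=1e-3):
    (xf, yf), (xg, yg) = train_batch, val_batch
    eps = jnp.zeros(xf.shape[0])  # per-example weights, initialized to zero

    def val_loss_after_lookahead(eps):
        # One look-ahead SGD step on the eps-weighted training loss ...
        grads = jax.grad(lambda p: jnp.sum(eps * per_example_loss(p, xf, yf)))(params)
        theta_hat = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
        # ... then the clean validation loss at the updated parameters.
        return jnp.mean(per_example_loss(theta_hat, xg, yg))

    # Gradient of the validation loss w.r.t. the example weights.
    eps_grad = jax.grad(val_loss_after_lookahead)(eps)
    w = jnp.maximum(-eps_grad, 0.0)        # clip negative entries to zero
    w = w / jnp.maximum(jnp.sum(w), 1e-8)  # normalize; guard the all-zero case

    # Actual update: SGD on the reweighted training loss.
    grads = jax.grad(lambda p: jnp.sum(w * per_example_loss(p, xf, yf)))(params)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
```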
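For the Dataset Splits row, the paper carves a small, class-balanced clean validation set out of the training data (e.g. 10 images per label class for BACKGROUNDFLIP). A hedged sketch of one way to build such a split; the helper name and seeding are hypothetical:

```python
import numpy as np

def balanced_clean_split(labels, per_class, seed=0):
    # Pick `per_class` indices per label class for the clean validation set;
    # everything else stays in the training set.
    rng = np.random.default_rng(seed)
    val_idx = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        val_idx.extend(rng.choice(idx, size=per_class, replace=False))
    val_idx = np.array(val_idx)
    train_idx = np.setdiff1d(np.arange(len(labels)), val_idx)
    return train_idx, val_idx
```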
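The Experiment Setup row fully specifies the ResNet-32 optimizer and schedule. A short sketch of that configuration using optax, which is an assumed library choice here rather than one named in the paper:

```python
import optax

# ResNet-32 setting quoted above: SGD with momentum 0.9, initial learning
# rate 0.1, decayed by a factor of 0.1 at 40K and 60K steps (80K steps total).
schedule = optax.piecewise_constant_schedule(
    init_value=0.1,
    boundaries_and_scales={40_000: 0.1, 60_000: 0.1},
)
optimizer = optax.sgd(learning_rate=schedule, momentum=0.9)
```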