Learning to Reweight Examples for Robust Deep Learning

Authors: Mengye Ren, Wenyuan Zeng, Bin Yang, Raquel Urtasun

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To test the effectiveness of our reweighting algorithm, we designed both class imbalance and noisy label settings, and a combination of both, on standard MNIST and CIFAR benchmarks for image classification using deep CNNs.
Researcher Affiliation | Collaboration | (1) Uber Advanced Technologies Group, Toronto, ON, Canada; (2) Department of Computer Science, University of Toronto, Toronto, ON, Canada.
Pseudocode | Yes | Algorithm 1: Learning to Reweight Examples using Automatic Differentiation. (A code sketch of this step follows the table.)
Open Source Code | No | The paper does not provide a link to open-source code for the methodology described.
Open Datasets | Yes | We use the standard MNIST handwritten digit classification dataset and subsample the dataset to generate a class imbalance binary classification task. (...) We conduct experiments on CIFAR-10 and CIFAR-100.
Dataset Splits | Yes | We split the balanced validation set of 10 images directly from the training set. (...) For UNIFORMFLIP, we use 1,000 clean images in the validation set; for BACKGROUNDFLIP, we use 10 clean images per label class. (A data-preparation sketch follows the table.)
Hardware Specification | No | The paper does not specify any hardware used for the experiments (e.g., GPU/CPU models).
Software Dependencies | No | The paper mentions 'popular deep learning frameworks such as TensorFlow (Abadi et al., 2016)' but does not specify version numbers for TensorFlow or any other software dependency.
Experiment Setup | Yes | The network is trained with SGD with a learning rate of 1e-3 and a mini-batch size of 100 for a total of 8,000 steps. (...) We train the models with SGD with momentum, at an initial learning rate of 0.1 and momentum 0.9 with mini-batch size 100. For ResNet-32 models, the learning rate decays by 0.1 at 40K and 60K steps, for a total of 80K steps. For WRN and early-stopped versions of ResNet-32 models, the learning rate decays at 40K and 50K steps, for a total of 60K steps. (A learning-rate-schedule sketch follows the table.)
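
Since Algorithm 1 is given as pseudocode but no code is released, the following is a minimal sketch of the reweighting step it describes, written in Python/JAX for a toy linear classifier. The model, the learning rate, and the names per_example_loss and reweighted_step are illustrative assumptions, not the authors' implementation; the structure follows the algorithm's one-step lookahead: weight the per-example training losses, take a virtual SGD step, differentiate the clean-validation loss with respect to the weights, then clamp, normalise, and update on the reweighted loss.

```python
import jax
import jax.numpy as jnp


def per_example_loss(params, x, y):
    # Numerically stable per-example logistic loss for a linear classifier
    # (toy stand-in for the paper's CNNs).
    logits = x @ params["w"] + params["b"]
    return jnp.maximum(logits, 0.0) - logits * y + jnp.log1p(jnp.exp(-jnp.abs(logits)))


def reweighted_step(params, train_batch, val_batch, lr=1e-3):
    xt, yt = train_batch  # possibly noisy / imbalanced training mini-batch
    xv, yv = val_batch    # small clean, balanced validation mini-batch

    def lookahead_val_loss(eps):
        # One virtual SGD step on the eps-weighted training loss ...
        grads = jax.grad(lambda p: jnp.sum(eps * per_example_loss(p, xt, yt)))(params)
        theta_hat = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
        # ... evaluated on the clean validation mini-batch.
        return jnp.mean(per_example_loss(theta_hat, xv, yv))

    # Example weights: clamped, normalised negative gradient of the validation
    # loss with respect to the per-example weights (initialised at zero).
    g_eps = jax.grad(lookahead_val_loss)(jnp.zeros(xt.shape[0]))
    w_tilde = jnp.maximum(-g_eps, 0.0)
    w = w_tilde / jnp.maximum(jnp.sum(w_tilde), 1e-8)

    # Actual parameter update on the reweighted training loss.
    grads = jax.grad(lambda p: jnp.sum(w * per_example_loss(p, xt, yt)))(params)
    new_params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return new_params, w
```

Automatic differentiation handles the second-order term (the gradient of the validation loss flowing through the lookahead update), which is why the paper notes the method can be implemented in standard frameworks such as TensorFlow.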
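
The dataset and split quotes describe subsampling MNIST into a class-imbalanced binary task and carving a tiny balanced, clean validation set directly out of the training data. Below is a minimal NumPy sketch of that kind of preparation; the function name, class pair, and counts are placeholders rather than the paper's exact configuration.

```python
import numpy as np


def imbalanced_binary_split(x, y, major_class, minor_class,
                            n_major, n_minor, n_val_per_class, seed=0):
    """Subsample a two-class subset with a chosen imbalance and hold out a
    small balanced, clean validation set (illustrative helper)."""
    rng = np.random.default_rng(seed)
    major_idx = rng.permutation(np.where(y == major_class)[0])[:n_major + n_val_per_class]
    minor_idx = rng.permutation(np.where(y == minor_class)[0])[:n_minor + n_val_per_class]

    # Balanced, clean validation set taken directly from the training pool;
    # e.g. n_val_per_class=5 gives the 10-image validation set quoted above.
    val_idx = np.concatenate([major_idx[:n_val_per_class], minor_idx[:n_val_per_class]])
    train_idx = np.concatenate([major_idx[n_val_per_class:], minor_idx[n_val_per_class:]])

    to_binary = lambda idx: (y[idx] == minor_class).astype(np.float32)
    return (x[train_idx], to_binary(train_idx)), (x[val_idx], to_binary(val_idx))
```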
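
For the ResNet-32 runs, the experiment-setup quote fully specifies a step schedule (initial rate 0.1, multiplied by 0.1 at 40K and 60K steps, 80K steps total). A small sketch of that schedule, with an assumed function name:

```python
def resnet32_learning_rate(step, base_lr=0.1, boundaries=(40_000, 60_000), decay=0.1):
    # Piecewise-constant schedule: multiply the rate by `decay` at each boundary.
    lr = base_lr
    for boundary in boundaries:
        if step >= boundary:
            lr *= decay
    return lr


# 0.1 before 40K steps, 0.01 from 40K, 0.001 from 60K to 80K.
assert resnet32_learning_rate(10_000) == 0.1
assert abs(resnet32_learning_rate(50_000) - 0.01) < 1e-12
assert abs(resnet32_learning_rate(70_000) - 0.001) < 1e-12
```

The WRN and early-stopped ResNet-32 runs would use the same shape of schedule with boundaries at 40K and 50K steps and 60K steps in total, per the quote above.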