Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing

Authors: Hanzhang Hu, Debadeepta Dey, Martial Hebert, J. Andrew Bagnell (pp. 3812–3821)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, the adaptive weights induce more competitive anytime predictions on multiple recognition data-sets and models than non-adaptive approaches including weighing all losses equally. (The loss-weighting idea is sketched in code after the table.)
Researcher Affiliation | Collaboration | 1 Carnegie Mellon University, 2 Microsoft Research
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide a concrete access statement or link to the source code for the methodology described.
Open Datasets | Yes | Data-sets. We experiment on CIFAR10, CIFAR100 (Krizhevsky, Nair, and Hinton 2009), SVHN (Netzer et al. 2011) and ILSVRC (Russakovsky et al. 2015). Footnote 1: Both CIFAR data-sets consist of 32x32 colored images. CIFAR10 and CIFAR100 have 10 and 100 classes, and each have 50000 training and 10000 testing images.
Dataset Splits | Yes | We held out the last 5000 training samples in CIFAR10 and CIFAR100 for validation; the same parameters are then used in other models. (See the split sketch after the table.)
Hardware Specification | No | The paper describes experimental setup details such as learning rates and epochs, but it does not specify the hardware (e.g., specific GPU or CPU models) used for running the experiments.
Software Dependencies | No | The paper describes the training process and models used, but it does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | We optimize the models using stochastic gradient descent, with initial learning rate of 0.1, momentum of 0.9 and a weight decay of 1e-4. On CIFAR and SVHN, we divide the learning rate by 10 at 1/2 and 3/4 of the total epochs. We train for 300 epochs on CIFAR and 60 epochs on SVHN. On ILSVRC, we train for 90 epochs, and divide the learning rate by 10 at epoch 30 and 60. (See the schedule sketch after the table.)
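
The Research Type row quotes the paper's central claim: adaptively weighted anytime losses are more competitive than weighing all losses equally. The adaptive rule itself is not reproduced in the excerpts above, so the sketch below only contrasts the equal-weighting baseline with one plausible adaptive scheme (weights inversely proportional to a running average of each exit's loss); it should not be read as the paper's exact method.

```python
# Illustrative only: the adaptive branch uses an assumed inverse-running-average
# weighting, not necessarily the weighting proposed in the paper.
import torch

def combined_anytime_loss(exit_losses, running_avgs, adaptive=True, momentum=0.9, eps=1e-8):
    """Combine the per-exit losses of an anytime network into one training objective."""
    total = torch.zeros((), device=exit_losses[0].device)
    for i, loss in enumerate(exit_losses):
        # Track a running average of this exit's loss magnitude (outside the graph).
        running_avgs[i] = momentum * running_avgs[i] + (1.0 - momentum) * float(loss.detach())
        # Adaptive: rescale each term so exits with larger average losses do not
        # dominate the sum; non-adaptive: the "weigh all losses equally" baseline.
        weight = 1.0 / (running_avgs[i] + eps) if adaptive else 1.0
        total = total + weight * loss
    return total
```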
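
The Dataset Splits row describes holding out the last 5000 CIFAR training images for validation. A minimal sketch of that split, assuming torchvision-style datasets (an assumption, since the paper does not specify its data-loading tooling):

```python
# Hold out the last 5000 training images for validation, as described in the paper.
from torch.utils.data import Subset
from torchvision import datasets, transforms

def cifar_train_val(root="./data", num_classes=10):
    ds_cls = datasets.CIFAR10 if num_classes == 10 else datasets.CIFAR100
    full_train = ds_cls(root, train=True, download=True, transform=transforms.ToTensor())
    n = len(full_train)                                  # 50000 training images
    train_set = Subset(full_train, range(n - 5000))      # first 45000 images for training
    val_set = Subset(full_train, range(n - 5000, n))     # last 5000 held out for validation
    return train_set, val_set
```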
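
The Experiment Setup row is concrete enough to sketch in code. Because the paper does not name its framework (see the Software Dependencies row), the following PyTorch snippet is only an illustration of the reported settings, with `model` standing in for any of the evaluated architectures:

```python
# Sketch of the reported SGD schedule: lr 0.1, momentum 0.9, weight decay 1e-4,
# lr divided by 10 at 1/2 and 3/4 of training on CIFAR/SVHN, at epochs 30 and 60 on ILSVRC.
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

def make_optimizer_and_scheduler(model, dataset="cifar"):
    total_epochs = {"cifar": 300, "svhn": 60, "ilsvrc": 90}[dataset]
    if dataset == "ilsvrc":
        milestones = [30, 60]                                     # lr / 10 at epochs 30 and 60
    else:
        milestones = [total_epochs // 2, total_epochs * 3 // 4]   # lr / 10 at 1/2 and 3/4 of training
    optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
    scheduler = MultiStepLR(optimizer, milestones=milestones, gamma=0.1)
    return optimizer, scheduler, total_epochs
```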