Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing
Authors: Hanzhang Hu, Debadeepta Dey, Martial Hebert, J. Andrew Bagnell (pp. 3812-3821)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, the adaptive weights induce more competitive anytime predictions on multiple recognition data-sets and models than non-adaptive approaches including weighing all losses equally. |
| Researcher Affiliation | Collaboration | Carnegie Mellon University; Microsoft Research |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a concrete access statement or link to the source code for the methodology described. |
| Open Datasets | Yes | Data-sets. We experiment on CIFAR10, CIFAR100 (Krizhevsky, Nair, and Hinton 2009), SVHN (Netzer et al. 2011) and ILSVRC (Russakovsky et al. 2015). Footnote: Both CIFAR data-sets consist of 32x32 colored images. CIFAR10 and CIFAR100 have 10 and 100 classes, and each have 50000 training and 10000 testing images. |
| Dataset Splits | Yes | We held out the last 5000 training samples in CIFAR10 and CIFAR100 for validation; the same parameters are then used in other models. |
| Hardware Specification | No | The paper describes experimental setup details such as learning rates and epochs, but it does not specify the hardware (e.g., specific GPU or CPU models) used for running the experiments. |
| Software Dependencies | No | The paper describes the training process and models used, but it does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We optimize the models using stochastic gradient descent, with initial learning rate of 0.1, momentum of 0.9 and a weight decay of 1e-4. On CIFAR and SVHN, we divide the learning rate by 10 at 1/2 and 3/4 of the total epochs. We train for 300 epochs on CIFAR and 60 epochs on SVHN. On ILSVRC, we train for 90 epochs, and divide the learning rate by 10 at epoch 30 and 60. (A sketch of this setup follows the table.) |
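
The optimizer and learning-rate schedule quoted in the Experiment Setup row can be summarized in code. The paper does not state which framework was used, so the sketch below assumes PyTorch; `model`, `train_loader`, and the `dataset` tag are hypothetical placeholders, and the single cross-entropy loss stands in for the paper's weighted sum of anytime losses.

```python
# Minimal sketch of the quoted training configuration, assuming PyTorch.
# `model` and `train_loader` are hypothetical placeholders, not from the paper.
import torch
import torch.nn as nn


def build_optimizer_and_schedule(model, dataset):
    """SGD with lr 0.1, momentum 0.9, weight decay 1e-4; lr divided by 10
    at 1/2 and 3/4 of training (CIFAR/SVHN) or at epochs 30 and 60 (ILSVRC)."""
    optimizer = torch.optim.SGD(
        model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)

    if dataset in ("cifar10", "cifar100"):
        epochs = 300
        milestones = [epochs // 2, epochs * 3 // 4]   # 150, 225
    elif dataset == "svhn":
        epochs = 60
        milestones = [epochs // 2, epochs * 3 // 4]   # 30, 45
    else:  # ILSVRC
        epochs = 90
        milestones = [30, 60]

    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=milestones, gamma=0.1)
    return optimizer, scheduler, epochs


def train(model, train_loader, dataset, device="cuda"):
    # Single-output cross-entropy here; the paper trains a weighted sum of
    # losses over all anytime predictors instead.
    criterion = nn.CrossEntropyLoss()
    optimizer, scheduler, epochs = build_optimizer_and_schedule(model, dataset)
    model.to(device).train()
    for epoch in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()  # decay once per epoch at the milestones above
```

The milestone arithmetic reproduces the quoted schedule: epochs 150/225 for CIFAR, 30/45 for SVHN, and 30/60 for ILSVRC.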