Training Neural Networks with Local Error Signals
Authors: Arild Nøkland, Lars Hiller Eidnes
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed experiments on MNIST, Fashion-MNIST, Kuzushiji-MNIST, CIFAR-10, CIFAR-100, STL-10 and SVHN to evaluate the performance of the training method. |
| Researcher Affiliation | Collaboration | ¹Kongsberg Seatex, Trondheim, Norway; ²Trondheim, Norway. Correspondence to: Arild Nøkland <arild.nokland@gmail.com>, Lars H. Eidnes <larseidnes@gmail.com>. |
| Pseudocode | No | The paper describes the methods textually and mathematically but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available. The code for the experiments is available at https://github.com/anokland/local-loss |
| Open Datasets | Yes | We performed experiments on MNIST, Fashion-MNIST, Kuzushiji-MNIST, CIFAR-10, CIFAR-100, STL-10 and SVHN to evaluate the performance of the training method. |
| Dataset Splits | No | The paper mentions training epochs and reporting test error for the last epoch, but does not explicitly provide details about a validation dataset split (e.g., percentages or counts for a separate validation set). |
| Hardware Specification | No | The paper states 'Despite the large number of parameters, we were able to train the networks on a single GPU' but does not provide specific details about the GPU model or any other hardware. |
| Software Dependencies | No | The paper mentions using the 'PyTorch framework' and 'ADAM' but does not specify their version numbers or other software dependencies with versions. |
| Experiment Setup | Yes | A batch size of 128 was used in all experiments. ADAM was used for optimization (Kingma & Ba, 2014). The weighting factor β was manually tuned and set to 0.99 for all experiments with the predsim loss. [...] The initial learning rate was 5e-4. The average-pooling kernel size for the pred loss was chosen so that the input dimension to the local classifier was 1024. The dropout rate was 0.1 for MLP and 0.2 for VGG8B. For the cutout experiment, the cutout hole size was 14. |
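
The Experiment Setup row above reports concrete hyperparameters (batch size 128, ADAM, initial learning rate 5e-4, β = 0.99 for the predsim loss, dropout 0.1/0.2, cutout hole size 14). The snippet below is a minimal PyTorch sketch that collects these values and shows one way the predsim weighting could be applied; the convex combination `(1 - beta) * pred + beta * sim`, the placeholder model, and the function name `predsim_loss` are assumptions for illustration, not the authors' released code (see the linked repository for that).

```python
import torch
import torch.nn as nn

# Hyperparameters as reported in the Experiment Setup row above.
BATCH_SIZE = 128              # "A batch size of 128 was used in all experiments."
LEARNING_RATE = 5e-4          # "The initial learning rate was 5e-4."
BETA = 0.99                   # weighting factor for the predsim loss
DROPOUT_MLP = 0.1             # dropout rate for the MLP networks
DROPOUT_VGG8B = 0.2           # dropout rate for VGG8B
CUTOUT_SIZE = 14              # cutout hole size in the cutout experiment
LOCAL_CLASSIFIER_DIM = 1024   # target input dimension to the local classifier


def predsim_loss(pred_loss: torch.Tensor,
                 sim_loss: torch.Tensor,
                 beta: float = BETA) -> torch.Tensor:
    """Combine the local prediction and similarity-matching losses.

    Assumption: beta weights the two terms as a convex combination;
    only the value beta = 0.99 is stated in the summary above.
    """
    return (1.0 - beta) * pred_loss + beta * sim_loss


# Placeholder single-hidden-layer model (not the paper's MLP or VGG8B
# architecture), used only to show the optimizer and dropout settings.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32 * 3, 1024),
    nn.ReLU(),
    nn.Dropout(DROPOUT_VGG8B),
    nn.Linear(1024, 10),
)

# ADAM with the reported initial learning rate.
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
```

In the paper's setup these losses are computed locally per layer rather than once at the output; the sketch above only illustrates how the reported constants would plug into an optimizer and a weighted loss.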