Orthogonalized SGD and Nested Architectures for Anytime Neural Networks
Authors: Chengcheng Wan, Henry Hoffmann, Shan Lu, Michael Maire
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 (Experiments): Tables 1 and 2 show the validation error rates of applying five different optimizers to different anytime networks. We begin with evaluation using the CIFAR-10 dataset (Krizhevsky & Hinton, 2012). |
| Researcher Affiliation | Academia | University of Chicago, Chicago, IL, USA. Correspondence to: Chengcheng Wan <cwan@uchicago.edu>. |
| Pseudocode | Yes | Algorithm 1 (Greedy stage-wise multitask training); Algorithm 2 (Orthogonalized SGD: a multitask variant of SGD with optional dynamic normalization of task influence). A hedged sketch of the orthogonalization step appears after this table. |
| Open Source Code | No | No explicit statement or link regarding the release of source code for the described methodology is provided in the paper. |
| Open Datasets | Yes | "We begin with evaluation using the CIFAR-10 dataset (Krizhevsky & Hinton, 2012)" and the "large-scale ImageNet (ILSVRC 2012) dataset (Deng et al., 2009)". |
| Dataset Splits | No | The paper mentions 'validation error' and 'validation error rates' but does not provide specific details on how the validation set was created (e.g., split percentages or counts) or explicitly refer to a standard validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., 'Python 3.x', 'PyTorch 1.x') that would allow for reproducible setup. |
| Experiment Setup | Yes | "All networks are trained for 200 epochs, with learning rate decreasing from 0.1 to 0.0008" and "All networks are trained for 90 epochs, with learning rate decreasing from 0.1 to 0.0001". A hedged schedule sketch matching these endpoints appears after the table. |
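
The paper's Algorithm 2 is quoted above only by title, so the snippet below is a minimal sketch of its core idea under stated assumptions: tasks (anytime exits) are ordered by priority, each task's gradient is flattened into a single vector, and each lower-priority gradient is projected onto the orthogonal complement of the higher-priority gradients (Gram-Schmidt) before the update is summed. The `orthogonalized_update` name and the `normalize` flag (standing in for the paper's "dynamic normalization of task influence") are illustrative, not the authors' implementation.

```python
# Hedged sketch of the orthogonalization step in Orthogonalized SGD (OSGD).
# Assumptions: task_grads are flattened per-task gradients, ordered from
# highest to lowest priority; lower-priority gradients are projected onto
# the orthogonal complement of higher-priority ones before summation.
import torch


def orthogonalized_update(task_grads, normalize=False):
    """Combine per-task gradient vectors, orthogonalizing each against the
    already-processed higher-priority gradients.

    task_grads: list of 1-D tensors, ordered from highest to lowest priority.
    normalize:  if True, rescale each projected gradient back to its original
                norm (a stand-in for "dynamic normalization of task influence").
    """
    basis = []                                   # orthonormal basis from higher-priority gradients
    combined = torch.zeros_like(task_grads[0])
    for g in task_grads:
        g_orig_norm = g.norm()
        g_proj = g.clone()
        for b in basis:
            g_proj = g_proj - (g_proj @ b) * b   # remove component along b
        if normalize and g_proj.norm() > 1e-12:
            g_proj = g_proj * (g_orig_norm / g_proj.norm())
        combined = combined + g_proj
        if g_proj.norm() > 1e-12:
            basis.append(g_proj / g_proj.norm())  # extend the basis
    return combined
```

In use, one would flatten each exit's gradients (e.g. `torch.cat([p.grad.view(-1) for p in model.parameters()])` after a per-task backward pass), pass the list to `orthogonalized_update`, scatter the combined vector back into `p.grad`, and then call the optimizer's `step()`.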
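
The quoted setup gives only the endpoints of the learning-rate schedules, not the decay shape, and pairing the 200-epoch run with CIFAR-10 and the 90-epoch run with ImageNet is inferred from context rather than stated in the quotes. The sketch below assumes a smooth geometric decay between the quoted endpoints purely for illustration.

```python
# Hedged sketch of a learning-rate schedule matching only the quoted endpoints
# (0.1 -> 0.0008 over 200 epochs; 0.1 -> 0.0001 over 90 epochs). The decay
# shape is an assumption: a geometric interpolation is used for illustration.

def lr_at_epoch(epoch, total_epochs, lr_start=0.1, lr_end=0.0008):
    """Geometrically interpolate from lr_start to lr_end across total_epochs."""
    frac = epoch / max(total_epochs - 1, 1)
    return lr_start * (lr_end / lr_start) ** frac


cifar_lrs = [lr_at_epoch(e, 200, lr_end=0.0008) for e in range(200)]   # 200-epoch run
imagenet_lrs = [lr_at_epoch(e, 90, lr_end=0.0001) for e in range(90)]  # 90-epoch run
assert abs(cifar_lrs[0] - 0.1) < 1e-9 and abs(cifar_lrs[-1] - 0.0008) < 1e-9
```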