Decoupling the Layers in Residual Networks
Authors: Ricky Fok, Aijun An, Zana Rashidi, Xiaogang Wang
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate through an extensive performance study that the proposed network achieves comparable predictive performance to the original residual network with the same number of parameters, while achieving a significant speed-up on the total training time. |
| Researcher Affiliation | Academia | Ricky Fok, Aijun An, Zana Rashidi Department of Electrical Engineering and Computer Science York University 4700 Keele Street, Toronto, M3J 1P3, Canada ricky.fok3@gmail.com, aan@cse.yorku.ca, zrashidi@cse.yorku.ca Xiaogang Wang Department of Mathematics and Statistics York University 4700 Keele Street, Toronto, M3J 1P3, Canada stevenw@mathstat.yorku.ca |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper only acknowledges third-party code: 'We thank Wenxin Xu for providing his code for ResNet at https://github.com/wenxinxu/resnet_in_tensorflow.' It does not provide concrete access to the authors' own source code for the methodology described. |
| Open Datasets | Yes | For the CIFAR-10 and CIFAR-100 data sets, we trained for 80000 iterations, or 204 epochs. We also tested WarpNet on a down-sampled (32x32) ImageNet data set (Chrabaszcz & Hutter, 2017). |
| Dataset Splits | Yes | For the CIFAR-10 and CIFAR-100 data sets, we trained for 80000 iterations, or 204 epochs. We took a training batch size of 128. Initial learning rate is 0.1. The learning rate drops by a factor of 0.1 at epochs 60, 120, and 160, with a weight decay of 0.0005. [For the down-sampled ImageNet data set:] The data set contains 1000 classes with 1281167 training images and 50000 validation images with 50 images each class. |
| Hardware Specification | No | The paper mentions 'GPUs' and memory constraints ('requires too much memory on a single GPU') but does not specify exact GPU models, CPU models, or other detailed computer specifications used for experiments. |
| Software Dependencies | No | The paper mentions using 'Tensorflow' for implementation but does not provide specific version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | For the CIFAR-10 and CIFAR-100 data sets, we trained for 80000 iterations, or 204 epochs. We took a training batch size of 128. Initial learning rate is 0.1. The learning rate drops by a factor of 0.1 at epochs 60, 120, and 160, with a weight decay of 0.0005. [For the down-sampled ImageNet data set:] The training batch size is 512, initial learning rate is 0.4 and drops by a factor of 0.1 at every 30 epochs. The weight decay is set to be 0.0001. |
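
The iteration-to-epoch conversion quoted in the Dataset Splits and Experiment Setup rows can be checked with a few lines of arithmetic. Below is a minimal sketch, assuming the standard 50,000-image CIFAR-10/100 training split (the quoted text does not state the CIFAR training-set size explicitly):

```python
# Sanity check of the quoted epoch count for CIFAR-10/100 training.
iterations = 80_000     # "we trained for 80000 iterations"
batch_size = 128        # "a training batch size of 128"
train_images = 50_000   # standard CIFAR-10/100 training split (assumed, not quoted)

epochs = iterations * batch_size / train_images
print(epochs)  # 204.8 -- consistent with the quoted "80000 iterations, or 204 epochs"
```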
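
The learning-rate schedules quoted in the Experiment Setup row are piecewise constant. The sketch below expresses both quoted configurations in TensorFlow, the framework the paper names (without a version). The drop epochs, initial rates, batch sizes, and weight decay values come from the quoted text; the use of SGD with momentum 0.9, the Keras `PiecewiseConstantDecay` API, and the number of drops in the ImageNet schedule are assumptions, not details given by the paper.

```python
import tensorflow as tf

# CIFAR-10/100 setting as quoted: batch 128, initial LR 0.1,
# LR drops by a factor of 0.1 at epochs 60, 120, and 160, weight decay 5e-4.
CIFAR_STEPS_PER_EPOCH = 50_000 // 128        # ~390 steps per epoch (50k training images assumed)
CIFAR_WEIGHT_DECAY = 5e-4                    # application mechanism (e.g. L2 regularizer) assumed

cifar_lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[60 * CIFAR_STEPS_PER_EPOCH, 120 * CIFAR_STEPS_PER_EPOCH, 160 * CIFAR_STEPS_PER_EPOCH],
    values=[0.1, 0.01, 0.001, 0.0001],
)
cifar_opt = tf.keras.optimizers.SGD(learning_rate=cifar_lr, momentum=0.9)  # momentum assumed

# Down-sampled ImageNet setting as quoted: batch 512, initial LR 0.4,
# LR drops by a factor of 0.1 every 30 epochs, weight decay 1e-4.
IMAGENET_STEPS_PER_EPOCH = 1_281_167 // 512  # ~2502 steps per epoch (1281167 training images quoted)
IMAGENET_WEIGHT_DECAY = 1e-4

imagenet_lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[30 * IMAGENET_STEPS_PER_EPOCH, 60 * IMAGENET_STEPS_PER_EPOCH, 90 * IMAGENET_STEPS_PER_EPOCH],
    values=[0.4, 0.04, 0.004, 0.0004],       # total epoch count not quoted; three drops assumed
)
imagenet_opt = tf.keras.optimizers.SGD(learning_rate=imagenet_lr, momentum=0.9)
```

This is only a reconstruction of the quoted hyperparameters, not the authors' code, which the paper does not release.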