Learning Implicitly Recurrent CNNs Through Parameter Sharing
Authors: Pedro Savarese, Michael Maire
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate substantial parameter savings on standard image classification tasks, while maintaining accuracy. ... Table 1: Test error (%) on CIFAR-10 and CIFAR-100. ... Table 3: ImageNet classification results |
| Researcher Affiliation | Academia | Pedro Savarese TTI-Chicago savarese@ttic.edu Michael Maire University of Chicago mmaire@uchicago.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/lolemacs/soft-sharing |
| Open Datasets | Yes | The CIFAR-10 and CIFAR-100 datasets (Krizhevsky, 2009) are composed of 60,000 colored 32×32 images, labeled among 10 and 100 classes respectively, and split into 50,000 and 10,000 examples for training and testing. We use the ILSVRC 2012 dataset (Russakovsky et al., 2015) as a stronger test of our method. It is composed of 1.2M training and 50,000 validation images, drawn from 1000 classes. |
| Dataset Splits | Yes | The CIFAR-10 and CIFAR-100 datasets ... split into 50,000 and 10,000 examples for training and testing. We use the ILSVRC 2012 dataset (Russakovsky et al., 2015) ... It is composed of 1.2M training and 50,000 validation images, drawn from 1000 classes. |
| Hardware Specification | Yes | We achieve 2.69% test error after training less than 10 hours on a single NVIDIA GTX 1080 Ti. |
| Software Dependencies | No | The paper mentions using Adam optimizer and SGD, but does not provide specific version numbers for any software, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Following Zagoruyko & Komodakis (2016), we train each model for 200 epochs with SGD and Nesterov momentum of 0.9 and a batch size of 128. The learning rate is initially set to 0.1 and decays by a factor of 5 at epochs 60, 120 and 160. We also apply weight decay of 5×10⁻⁴ on all parameters except for the coefficients α. ... Each model trains for 50 epochs per phase with Adam (Kingma & Ba, 2015) and a fixed learning rate of 0.01. |
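
The quoted CIFAR setup maps onto a standard PyTorch optimizer configuration. The sketch below is illustrative only: it assumes the soft-sharing coefficients α can be identified by the substring `alpha` in their parameter names and that `model` is any `nn.Module`; it is not the authors' implementation (see the linked repository for that).

```python
# Hedged sketch of the quoted CIFAR training setup. The parameter-naming
# convention for the soft-sharing coefficients alpha is an assumption,
# not taken from the authors' code.
import torch
import torch.nn as nn


def build_optimizer_and_scheduler(model: nn.Module):
    # Weight decay of 5e-4 on all parameters except the coefficients alpha
    # (assumed here to carry "alpha" in their parameter name).
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        (no_decay if "alpha" in name else decay).append(param)

    optimizer = torch.optim.SGD(
        [
            {"params": decay, "weight_decay": 5e-4},
            {"params": no_decay, "weight_decay": 0.0},
        ],
        lr=0.1,        # initial learning rate
        momentum=0.9,  # Nesterov momentum of 0.9
        nesterov=True,
    )
    # Learning rate decays by a factor of 5 at epochs 60, 120 and 160.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[60, 120, 160], gamma=1 / 5
    )
    return optimizer, scheduler
```

Under the quoted schedule, training would then run for 200 epochs at batch size 128, with `scheduler.step()` called once per epoch so the learning rate drops at the stated milestones.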