Dynamic Capacity Networks
Authors: Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, Aaron Courville
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We focus our empirical evaluation on the Cluttered MNIST and SVHN image datasets. Our findings indicate that DCNs are able to drastically reduce the number of computations, compared to traditional convolutional neural networks, while maintaining similar or even better performance. |
| Researcher Affiliation | Collaboration | Amjad Almahairi (AMJAD.ALMAHAIRI@UMONTREAL.CA), Nicolas Ballas (NICOLAS.BALLAS@UMONTREAL.CA), Tim Cooijmans (TIM.COOIJMANS@UMONTREAL.CA), Yin Zheng (YIN.ZHENG@HULU.COM), Hugo Larochelle (HLAROCHELLE@TWITTER.COM), Aaron Courville (AARON.COURVILLE@UMONTREAL.CA). MILA, Université de Montréal, Québec, Canada; Hulu LLC, Beijing, China; Twitter, Cambridge, MA, USA |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | Yes | "We focus our empirical evaluation on the Cluttered MNIST and SVHN image datasets." and "We use the 100×100 Cluttered MNIST digit classification dataset (Mnih et al., 2014)." and "transcribing multi-digit sequences from natural images using the Street View House Numbers (SVHN) dataset (Netzer et al., 2011)." |
| Dataset Splits | Yes | "The dataset has the same size of MNIST: 60000 images for training and 10000 for testing." (Cluttered MNIST) and "The dataset has three subsets: train (33k), extra (202k) and test (13k). In the following, we trained our models on 230k images from both the train and extra subsets, where we take a 5k random sample as a validation set for choosing hyperparameters." (SVHN) A minimal sketch of this split appears below the table. |
| Hardware Specification | Yes | We evaluate all models on an NVIDIA Titan Black GPU card. |
| Software Dependencies | No | "We use Batch Normalization (Ioffe & Szegedy, 2015) and Adam (Kingma & Ba, 2014) for training our models." and "The authors would like to thank the developers of Theano (Bergstra et al., 2011; Bastien et al., 2012) and Blocks/Fuel (van Merriënboer et al., 2015) for developing such powerful tools for scientific computing". Specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | "Coarse layers: 2 convolutional layers, with 7×7 and 3×3 filter sizes, 12 and 24 filters, respectively, and a 2×2 stride. ... Fine layers: 5 convolutional layers, each with 3×3 filter sizes, 1×1 strides, and 24 filters. ... We use rectifier non-linearities in all layers. We use Batch Normalization (Ioffe & Szegedy, 2015) and Adam (Kingma & Ba, 2014) for training our models." (Cluttered MNIST) and "While training, we take a 54×110 random crop from images, and we use 0.2 dropout on convolutional layers and 0.5 dropout on fully connected layers." (SVHN) An illustrative architecture sketch appears below the table. |
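
The SVHN split quoted in the Dataset Splits row combines the train (33k) and extra (202k) subsets and holds out a random 5k sample for validation, leaving roughly 230k training images. The snippet below is a minimal sketch of that split, assuming the subsets are already loaded as NumPy arrays; the function name, variable names, and fixed seed are illustrative and do not come from the paper.

```python
import numpy as np

def split_svhn(train_x, train_y, extra_x, extra_y, n_valid=5000, seed=0):
    """Combine the SVHN train (~33k) and extra (~202k) subsets, then hold out
    a random validation sample for hyperparameter selection (sketch only)."""
    x = np.concatenate([train_x, extra_x], axis=0)
    y = np.concatenate([train_y, extra_y], axis=0)

    # Shuffle once, then carve off the validation sample.
    rng = np.random.RandomState(seed)
    perm = rng.permutation(len(x))
    valid_idx, train_idx = perm[:n_valid], perm[n_valid:]

    return (x[train_idx], y[train_idx]), (x[valid_idx], y[valid_idx])
```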
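
The Cluttered MNIST configuration quoted in the Experiment Setup row can also be written down directly. The sketch below uses PyTorch-style modules purely for readability; the authors worked in Theano/Blocks, the padding values and input channel counts are assumptions, and the classifier head on top of the convolutional stacks is omitted because it is not described in the quoted excerpt.

```python
import torch
import torch.nn as nn

# Coarse layers: 2 convolutional layers with 7x7 and 3x3 filters,
# 12 and 24 feature maps, and 2x2 strides (padding values are assumed).
coarse_layers = nn.Sequential(
    nn.Conv2d(1, 12, kernel_size=7, stride=2, padding=3),
    nn.BatchNorm2d(12),
    nn.ReLU(),
    nn.Conv2d(12, 24, kernel_size=3, stride=2, padding=1),
    nn.BatchNorm2d(24),
    nn.ReLU(),
)

# Fine layers: 5 convolutional layers, 3x3 filters, 1x1 strides, 24 maps each.
fine = []
in_channels = 1  # grayscale input; the first-layer channel count is assumed
for _ in range(5):
    fine += [
        nn.Conv2d(in_channels, 24, kernel_size=3, stride=1, padding=1),
        nn.BatchNorm2d(24),
        nn.ReLU(),
    ]
    in_channels = 24
fine_layers = nn.Sequential(*fine)

# The quoted setup trains with Batch Normalization and Adam; relying on
# Adam's default hyperparameters here is an assumption.
optimizer = torch.optim.Adam(
    list(coarse_layers.parameters()) + list(fine_layers.parameters())
)
```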