Learning by Stretching Deep Networks
Authors: Gaurav Pandey, Ambedkar Dukkipati
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results suggest that the proposed stretched deep convolutional networks are capable of achieving good performance for many object recognition tasks. More importantly, for a fixed network architecture, one can achieve much better accuracy using stretching rather than learning the weights using backpropagation. (Section 6, Experimental results) |
| Researcher Affiliation | Academia | Gaurav Pandey (GP88@CSA.IISC.ERNET.IN), Ambedkar Dukkipati (AD@CSA.IISC.ERNET.IN), Department of Computer Science and Automation, Indian Institute of Science, Bangalore-560012, India |
| Pseudocode | Yes | Algorithm 1 Iterative computation of the convolved kernel matrix |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a direct link to a code repository for the described methodology. |
| Open Datasets | Yes | MNIST (LeCun et al., 1998) is a standard dataset for character recognition with 50000 training and 10000 test samples of digits ranging from 0 to 9. The Caltech-101 dataset (Fei-Fei et al., 2007) consists of pictures of objects belonging to 101 categories with about 40 to 800 objects per category. The STL-10 dataset (Coates et al., 2011) is an image recognition dataset with 10 classes and 500 training and 800 test images per class. |
| Dataset Splits | No | For MNIST, the paper mentions '50000 training and 10000 test samples' but no explicit validation split. For Caltech-101 and STL-10, it similarly only specifies training and testing samples without mentioning a validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers). |
| Experiment Setup | Yes | The weight matrix used for the stretched and the unstretched model is the same and is obtained after training the model for 5 epochs (~150s). We randomly extract 9×9 patches from the images and learn a weight matrix with 64 weight vectors from these patches using a ReLU RBM. The weight matrix is then stretched by multiplication with a random matrix of size 64×64. We use average pooling with a 10×10 boxcar filter and 5×5 down-sampling. |
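The setup quoted in the last row can be sketched as follows. This is a hedged illustration in NumPy, not the authors' code: the 9×9/64-filter weight matrix is random here, standing in for the ReLU-RBM training step the paper describes, and the input image size and ReLU nonlinearity on the feature maps are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a 96x96 grayscale image (size is an assumption).
image = rng.random((96, 96))

# 64 weight vectors for 9x9 patches. In the paper these are learned
# with a ReLU RBM; random values here are only a placeholder.
W = rng.standard_normal((9 * 9, 64))

# "Stretching": multiply the weight matrix by a random 64x64 matrix.
R = rng.standard_normal((64, 64))
W_stretched = W @ R  # shape (81, 64)

# Dense feature maps: dot product of every 9x9 patch with each
# stretched filter, with an assumed ReLU nonlinearity.
fh, fw = 9, 9
H, Wimg = image.shape
maps = np.empty((H - fh + 1, Wimg - fw + 1, 64))
for i in range(maps.shape[0]):
    for j in range(maps.shape[1]):
        patch = image[i:i + fh, j:j + fw].reshape(-1)
        maps[i, j] = np.maximum(0.0, patch @ W_stretched)

# Average pooling with a 10x10 boxcar filter and 5x5 down-sampling.
ph, pw, stride = 10, 10, 5
out_h = (maps.shape[0] - ph) // stride + 1
out_w = (maps.shape[1] - pw) // stride + 1
pooled = np.empty((out_h, out_w, 64))
for i in range(out_h):
    for j in range(out_w):
        block = maps[i * stride:i * stride + ph,
                     j * stride:j * stride + pw]
        pooled[i, j] = block.mean(axis=(0, 1))

print(maps.shape, pooled.shape)
```

For a 96×96 input this yields 88×88 feature maps per filter, pooled down to 16×16; comparing the stretched model against the unstretched one amounts to repeating the pipeline with `W` in place of `W_stretched`.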