Preconditioned Spectral Descent for Deep Learning
Authors: David E. Carlson, Edo Collins, Ya-Ping Hsieh, Lawrence Carin, Volkan Cevher
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The results are promising in both computational time and quality when applied to Restricted Boltzmann Machines, Feedforward Neural Nets, and Convolutional Neural Nets. ... We empirically validate these ideas by applying them to RBMs, deep belief nets, feedforward neural nets, and convolutional neural nets. We demonstrate major speedups on all models, and demonstrate improved fit for the RBM and the deep belief net. |
| Researcher Affiliation | Academia | (1) Department of Statistics, Columbia University; (2) Laboratory for Information and Inference Systems (LIONS), EPFL; (3) Department of Electrical and Computer Engineering, Duke University |
| Pseudocode | Yes | Algorithm 1 RMSspectral for RBMs... Algorithm 2 RMSspectral for FNN (a hedged sketch of this update appears below the table) |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the methodology described. |
| Open Datasets | Yes | To show the use of the approximate #-operator from Section 2.4 as well as RMSspec and ADAspec, we first perform experiments on the MNIST dataset. ... For further evidence, we performed the same maximum-likelihood experiment on the Caltech-101 Silhouettes dataset [17]. ... We demonstrate this claim using the well-known MNIST and Cifar-10 [15] image datasets. |
| Dataset Splits | Yes | Both datasets are similar in that they pose a classification task over 10 possible classes. However, CIFAR-10, consisting of 50K RGB images of vehicles and animals, with an additional 10K images reserved for testing, poses a considerably more difficult problem than MNIST, with its 60K greyscale images of hand-written digits, plus 10K test samples. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU model, CPU type) used for running the experiments. It only implies that computational resources were used. |
| Software Dependencies | No | The paper mentions using specific algorithms like RMSprop and ADAgrad, and models like RBMs and Neural Nets, but it does not specify any software frameworks (e.g., TensorFlow, PyTorch) or their version numbers, nor any specific libraries with versions. |
| Experiment Setup | Yes | We detail the algorithmic setting used in these experiments in Supplemental Table 1, which are chosen to match previous literature on the topic. The batch size was chosen to be 1000 data points... Algorithm parameters can be found in Supplemental Table 2. |
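The pseudocode row above refers to the paper's RMSspectral updates (Algorithms 1 and 2). For orientation, here is a minimal NumPy sketch of one such update as we read it: RMSprop-style elementwise preconditioning followed by the #-operator, which for a matrix gradient G with SVD G = U diag(s) Vᵀ returns ‖s‖₁ · U Vᵀ, the steepest-descent direction under the spectral norm. Since the paper releases no code, everything below is our illustrative assumption rather than the authors' implementation: the hyperparameter names `lr`, `rho`, and `eps` are ours, and we use an exact SVD where Section 2.4 of the paper employs a faster randomized approximation.

```python
import numpy as np

def sharp_operator(G):
    """#-operator for the Schatten-infinity (spectral) norm:
    for G = U diag(s) V^T, returns ||s||_1 * U V^T.
    (The paper approximates this with a randomized low-rank SVD;
    we use an exact SVD for clarity.)"""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return np.sum(s) * (U @ Vt)

def rmsspectral_step(W, grad, v, lr=1e-2, rho=0.9, eps=1e-8):
    """One RMSspectral-style update on a weight matrix W (our sketch).

    v is the RMSprop accumulator of squared gradients (same shape as W).
    The gradient is preconditioned elementwise, then the #-operator
    converts it into a steepest-descent direction under the spectral
    norm. Returns the updated (W, v).
    """
    v = rho * v + (1.0 - rho) * grad ** 2        # RMSprop accumulator
    precond = grad / (np.sqrt(v) + eps)          # elementwise preconditioning
    W = W - lr * sharp_operator(precond)         # spectral descent step
    return W, v

# Toy usage: descend on f(W) = 0.5 * ||W - A||_F^2 for a random target A.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
W = np.zeros_like(A)
v = np.zeros_like(A)
print("initial error:", np.linalg.norm(W - A))
for _ in range(500):
    grad = W - A
    W, v = rmsspectral_step(W, grad, v)
print("final error:", np.linalg.norm(W - A))
```

Note the design point this sketch illustrates: unlike plain RMSprop, the step direction is a scaled partial isometry U Vᵀ, so the update's magnitude is governed by the nuclear norm of the preconditioned gradient rather than by its elementwise values, which is the mechanism the paper credits for its speedups.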