Optimizing Neural Networks with Kronecker-factored Approximate Curvature
Authors: James Martens, Roger Grosse
ICML 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To investigate the practical performance of K-FAC we applied it to the 3 deep-autoencoder optimization problems from Hinton and Salakhutdinov (2006), which use the MNIST, CURVES, and FACES datasets respectively |
| Researcher Affiliation | Academia | James Martens (JMARTENS@CS.TORONTO.EDU), Roger Grosse (RGROSSE@CS.TORONTO.EDU), Department of Computer Science, University of Toronto |
| Pseudocode | Yes | Algorithm 1 (in Appendix A) shows how to compute the gradient Dθ of the loss function of a neural network using standard backpropagation. ... Finally, Appendix H gives complete high-level pseudocode for K-FAC. (A minimal sketch of the Kronecker-factored update this pseudocode builds on appears below the table.) |
| Open Source Code | No | The paper describes the implementation details and mentions using 'vectorized MATLAB code accelerated with the GPU package Jacket', but it does not state that the code is open source or provide any access link. |
| Open Datasets | Yes | To investigate the practical performance of K-FAC we applied it to the 3 deep-autoencoder optimization problems from Hinton and Salakhutdinov (2006), which use the MNIST, CURVES, and FACES datasets respectively (see Hinton and Salakhutdinov (2006) for a complete description of the network architectures and datasets). |
| Dataset Splits | No | The paper mentions training and test sets, and references 'the prescription given by Sutskever et al. (2013) for determining the learning rate', but it does not specify explicit percentages, sample counts, or detailed methodology for train/validation/test splits. |
| Hardware Specification | Yes | All tests were performed on a single computer with a 4.4 GHz Intel CPU and an NVIDIA GTX 580 GPU with 3 GB of memory. |
| Software Dependencies | No | Both K-FAC and the baseline were implemented using vectorized MATLAB code accelerated with the GPU package Jacket. The paper does not provide specific version numbers for MATLAB or the Jacket package. |
| Experiment Setup | Yes | In our main experiment we evaluated the performance of our implementation of K-FAC versus the baseline on all 3 deep autoencoder problems, where we used an exponentially increasing schedule for m within K-FAC... and a fixed setting of m within the baseline and momentum-less K-FAC... For each problem we followed the prescription given by Sutskever et al. (2013) for determining the learning rate, and the increasing schedule for the decay parameter µ. (Here m is the mini-batch size; illustrative sketches of both schedules follow the table.) |
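
The paper's Algorithm 1 and Appendix H pseudocode are not reproduced in this report, but the core operation they build on is compact enough to illustrate. The sketch below (in NumPy, rather than the authors' MATLAB/Jacket implementation) shows the layer-wise Kronecker-factored preconditioning at the heart of K-FAC: the Fisher block for a layer is approximated as A ⊗ G, where A is the second moment of the layer's input activations and G that of the back-propagated pre-activation derivatives, so applying the inverse reduces to two small matrix solves via (A ⊗ G)⁻¹ vec(V) = vec(G⁻¹ V A⁻¹). The function name, the simple isotropic damping, and all constants are illustrative assumptions; the paper itself uses a more elaborate factored Tikhonov damping and exponentially decayed averages for A and G.

```python
import numpy as np

def kfac_precondition(grad_W, a, g, damping=1e-2):
    """Apply a Kronecker-factored approximate inverse Fisher to one layer's
    weight gradient (a minimal sketch, not the authors' code).

    grad_W : (out_dim, in_dim) gradient of the loss w.r.t. the weights
    a      : (batch, in_dim)  layer input activations
    g      : (batch, out_dim) back-propagated pre-activation derivatives
    """
    m = a.shape[0]
    A = a.T @ a / m  # Kronecker factor from input activations
    G = g.T @ g / m  # Kronecker factor from pre-activation derivatives

    # Simple isotropic damping before inversion (an assumption here; the
    # paper uses a more sophisticated factored Tikhonov scheme).
    A_damped = A + damping * np.eye(A.shape[0])
    G_damped = G + damping * np.eye(G.shape[0])

    # (A (x) G)^{-1} vec(grad_W) = vec(G^{-1} grad_W A^{-1}), so the large
    # Kronecker-structured inverse reduces to two small dense solves.
    return np.linalg.solve(G_damped, grad_W) @ np.linalg.inv(A_damped)

# Toy usage: a single 5 -> 3 layer with a batch of 8 examples.
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 5))
g = rng.standard_normal((8, 3))
grad_W = g.T @ a / 8
print(kfac_precondition(grad_W, a, g).shape)  # (3, 5)
```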
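Similarly, the schedules quoted in the Experiment Setup row can be made concrete under stated assumptions. A plausible reading: the mini-batch size m is interpolated exponentially between a starting value and the full training-set size, and the momentum decay parameter µ follows the increasing schedule of Sutskever et al. (2013). The endpoint values, total step count, and 250-step interval below are illustrative, not values confirmed by the paper or the reproducibility check.

```python
import math

def batch_size_schedule(step, total_steps, m_min=1000, m_max=55000):
    """Exponentially increasing mini-batch size m (endpoints are assumed;
    the paper grows m toward the full training-set size)."""
    frac = min(step / total_steps, 1.0)
    return int(round(m_min * (m_max / m_min) ** frac))

def momentum_schedule(step, mu_max=0.99):
    """Increasing decay parameter mu, after Sutskever et al. (2013):
    mu_t = min(1 - 2^(-1 - log2(floor(t/250) + 1)), mu_max)."""
    return min(1.0 - 2.0 ** (-1.0 - math.log2(step // 250 + 1)), mu_max)

for t in [0, 250, 1000, 5000]:
    print(t, batch_size_schedule(t, 5000), round(momentum_schedule(t), 4))
```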