Spectral Representations for Convolutional Neural Networks
Authors: Oren Rippel, Jasper Snoek, Ryan P. Adams
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of this reparametrization on a number of CNN optimization tasks, converging 2-5 times faster than the standard spatial representation. We test spectral pooling on different classification tasks. We run all experiments on code optimized for the Xeon Phi coprocessor. |
| Researcher Affiliation | Collaboration | Oren Rippel, Department of Mathematics, Massachusetts Institute of Technology (rippel@math.mit.edu); Jasper Snoek, Twitter and Harvard SEAS (jsnoek@seas.harvard.edu); Ryan P. Adams, Twitter and Harvard SEAS (rpa@seas.harvard.edu) |
| Pseudocode | Yes | Algorithm 1 (Spectral pooling) — Input: map x ∈ ℝ^{M×N}, output size H×W; Output: pooled map x̂ ∈ ℝ^{H×W}. 1: y ← F(x); 2: ŷ ← CROPSPECTRUM(y, H×W); 3: ŷ ← TREATCORNERCASES(ŷ); 4: x̂ ← F⁻¹(ŷ). Algorithm 2 (Spectral pooling back-propagation) — Input: gradient w.r.t. output ∇_x̂ R; Output: gradient w.r.t. input ∇_x R. 1: ẑ ← F(∇_x̂ R); 2: ẑ ← REMOVEREDUNDANCY(ẑ); 3: z ← PADSPECTRUM(ẑ, M×N); 4: z ← RECOVERMAP(z); 5: ∇_x R ← F⁻¹(z). (A NumPy sketch of Algorithm 1 appears after this table.) |
| Open Source Code | No | The paper does not provide any explicit statements or links to open-source code for the described methodology. |
| Open Datasets | Yes | We test the information retainment properties of spectral pooling on the validation set of ImageNet (Russakovsky et al., 2015). We test spectral pooling on different classification tasks...These settings allow us to attain classification rates of 8.6% on CIFAR-10 and 31.6% on CIFAR-100. |
| Dataset Splits | No | The paper mentions using the validation set of ImageNet and the CIFAR-10/100 benchmarks, but it does not describe explicit train/validation/test split sizes or procedures. |
| Hardware Specification | Yes | We ran all experiments on code optimized for the Xeon Phi coprocessor. |
| Software Dependencies | No | The paper mentions using Spearmint (Snoek et al., 2015) for Bayesian optimization and Adam (Kingma & Ba, 2015) as an optimizer, but it does not specify version numbers for these or any other software components. |
| Experiment Setup | Yes | We hyperparametrize and optimize the following CNN architecture: (C^{3×3}_{96+32m} → SP_{⌊γH_m⌋×⌊γH_m⌋})^{M}_{m=1} → C^{1×1}_{96+32M} → C^{1×1}_{10/100} → GA → Softmax (Eq. 5). We perform hyperparameter optimization on the dimensionality decay rate γ ∈ [0.25, 0.85], number of layers M ∈ {1, ..., 15}, resolution randomization hyperparameters α, β ∈ [0, 0.8], weight decay rate in [10⁻⁵, 10⁻²], momentum in [1 − 0.1^0.5, 1 − 0.1^2] and initial learning rate in [0.1^4, 0.1]. We train each model for 150 epochs and anneal the learning rate by a factor of 10 at epochs 100 and 140. (A hypothetical sketch of this block structure and schedule appears after the table.) |
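
As a reading aid for the Pseudocode row, here is a minimal NumPy sketch of Algorithm 1. It is not the authors' implementation: the function name `spectral_pool`, the use of `fftshift` to center the spectrum, and taking the real part in place of the paper's TREATCORNERCASES step are all assumptions made here for brevity.

```python
import numpy as np

def spectral_pool(x, out_h, out_w):
    """Minimal sketch of Algorithm 1: pool an M x N map to out_h x out_w by
    cropping the centered frequency spectrum and inverting the transform."""
    # 1: y <- F(x): 2-D DFT, shifted so the low frequencies sit in the center
    y = np.fft.fftshift(np.fft.fft2(x))
    # 2: CropSpectrum(y, H x W): keep the central out_h x out_w block
    m, n = y.shape
    top, left = (m - out_h) // 2, (n - out_w) // 2
    y_crop = y[top:top + out_h, left:left + out_w]
    # 3-4: invert the transform; taking the real part stands in for the paper's
    # corner-case treatment, which guarantees a real-valued output map
    return np.real(np.fft.ifft2(np.fft.ifftshift(y_crop)))

# Example: pool a random 32x32 map down to 12x12 (illustrative sizes).
pooled = spectral_pool(np.random.randn(32, 32), 12, 12)
print(pooled.shape)  # (12, 12)
```

Algorithm 2 is the mirror image of this forward pass: the gradient spectrum is zero-padded back to M×N and the inverse transform recovers the gradient with respect to the input.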
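
The architecture and training schedule quoted in the Experiment Setup row can be summarized with the hypothetical helpers below. The function names `build_architecture` and `lr_at_epoch`, the default input size of 32, and the layer-tuple encoding are illustrative assumptions, not the authors' code.

```python
import math

def build_architecture(M, gamma, H0=32, n_classes=10):
    """Hypothetical layer list for (C^{3x3}_{96+32m} -> SP_{floor(gamma*H_m)})_{m=1..M}
    -> C^{1x1}_{96+32M} -> C^{1x1}_{10/100} -> GA -> Softmax."""
    layers, H = [], H0
    for m in range(1, M + 1):
        layers.append(("conv", 96 + 32 * m, 3))   # 3x3 conv with 96+32m filters
        H = max(1, math.floor(gamma * H))         # dimensionality decay rate gamma
        layers.append(("spectral_pool", H, H))    # pool to floor(gamma*H_m) square map
    layers.append(("conv", 96 + 32 * M, 1))       # 1x1 conv
    layers.append(("conv", n_classes, 1))         # 1x1 conv to 10 or 100 classes
    layers.append(("global_average",))
    layers.append(("softmax",))
    return layers

def lr_at_epoch(epoch, lr0):
    """Schedule described in the quote: 150 epochs total, learning rate
    annealed by a factor of 10 at epochs 100 and 140."""
    if epoch >= 140:
        return lr0 / 100.0
    if epoch >= 100:
        return lr0 / 10.0
    return lr0

# Example: a 5-block network with decay rate 0.75 on assumed 32x32 CIFAR inputs.
print(build_architecture(M=5, gamma=0.75))
```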