Structured Transforms for Small-Footprint Deep Learning

Authors: Vikas Sindhwani, Tara Sainath, Sanjiv Kumar

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results show that these transforms can significantly accelerate inference and forward/backward passes during training, and offer superior accuracy-compactness-speed tradeoffs in comparison to a number of existing techniques."
Researcher Affiliation | Industry | "Vikas Sindhwani, Tara N. Sainath, Sanjiv Kumar. Google, New York. {sindhwani, tsainath, sanjivk}@google.com"
Pseudocode | Yes | "Theorem 3.3 (Fast Multiplication). Given an n × b matrix X, the matrix-matrix product Y = (Σ_{i=1}^r Z_1(g_i) Z_{-1}(h_i)) X can be computed at the cost of 2(rb + b + r) FFTs, using the following algorithm. Set η = [1, η, η², ..., η^(n-1)]ᵀ where η = (-1)^(1/n) = exp(iπ/n); initialize Y = 0_{n×b}; set X̂ = fft(diag(η) X); set Ĝ = fft(G) = [ĝ_1 ... ĝ_r] and Ĥ = fft(diag(η) H) = [ĥ_1 ... ĥ_r]; for i = 1 to r: U = Z_{-1}(h_i) X = diag(η̄) ifft(diag(ĥ_i) X̂), V = diag(ĝ_i) fft(U), Y = Y + V; finally set Y = ifft(Y) and return Y." (A runnable sketch of this procedure is given below the table.)
Open Source Code | No | The paper cites supplementary material at http://vikas.sindhwani.org/st_supplementary.pdf (a PDF) and references the third-party FFTW library, but does not provide a link to, or statement about releasing, its own source code.
Open Datasets | Yes | "MNIST is the original 10-class MNIST digit classification dataset with 60000 training examples and 10000 test examples. We refer the reader to [23] for more details about the datasets." (Reference [23] is: T. Sainath and C. Parada. Convolutional neural networks for small-footprint keyword spotting. In Proc. Interspeech, 2015.)
Dataset Splits | Yes | "The utterances were randomly split into training, development and evaluation sets in the ratio of 80 : 5 : 15."
Hardware Specification | Yes | "6-core 32-GB Intel(R) Xeon(R) machine; random datasets."
Software Dependencies | No | The paper mentions "FFT implementations (we use FFTW: http://www.fftw.org)" but does not give a version number for FFTW or for any other software component.
Experiment Setup | Yes | "The global learning rate is set to 0.002, while our structured transform layers use a layer-specific learning rate of 0.0005; both are decayed by an exponential factor of 0.1."
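
The pseudocode quoted in the table above is Theorem 3.3's FFT-based routine for multiplying the structured matrix Σ_{i=1}^r Z_1(g_i) Z_{-1}(h_i), where Z_f(v) denotes the f-circulant matrix with first column v, against an n × b input X. The NumPy sketch below is a minimal, unofficial rendering of those steps; the helper z_f, the name structured_matmul, and the dense correctness check are illustrative additions, not code released with the paper.

```python
import numpy as np

def z_f(v, f):
    """Dense f-circulant matrix with first column v (used only for verification)."""
    n = len(v)
    M = np.empty((n, n), dtype=complex)
    for k in range(n):
        col = np.roll(v, k).astype(complex)
        col[:k] *= f                      # wrapped-around entries pick up the factor f
        M[:, k] = col
    return M

def structured_matmul(G, H, X):
    """Compute (sum_i Z_1(g_i) Z_{-1}(h_i)) X using 2(rb + b + r) FFTs."""
    n, r = G.shape
    eta = np.exp(1j * np.pi * np.arange(n) / n)   # eta_k = exp(i*pi*k/n)
    Xh = np.fft.fft(eta[:, None] * X, axis=0)     # fft(diag(eta) X): b FFTs
    Gh = np.fft.fft(G, axis=0)                    # fft(G): r FFTs
    Hh = np.fft.fft(eta[:, None] * H, axis=0)     # fft(diag(eta) H): r FFTs
    Y = np.zeros_like(Xh)
    for i in range(r):
        # U = Z_{-1}(h_i) X: skew-circulant product via the diag(eta) scaling trick
        U = np.conj(eta)[:, None] * np.fft.ifft(Hh[:, [i]] * Xh, axis=0)
        # accumulate fft(Z_1(g_i) U); the closing ifft is hoisted out of the loop
        Y += Gh[:, [i]] * np.fft.fft(U, axis=0)
    return np.fft.ifft(Y, axis=0)                 # one final batch of b inverse FFTs

# Sanity check against a dense construction on a small random instance.
rng = np.random.default_rng(0)
n, b, r = 8, 3, 2
G = rng.standard_normal((n, r))
H = rng.standard_normal((n, r))
X = rng.standard_normal((n, b))
M = sum(z_f(G[:, i], 1) @ z_f(H[:, i], -1) for i in range(r))
assert np.allclose(structured_matmul(G, H, X), M @ X)
```

The FFT count matches the theorem: b transforms for X, 2r for the generators G and H, 2b per loop iteration (2rb in total), and b for the final inverse transform, giving 2(rb + b + r).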