The Singular Values of Convolutional Layers

Authors: Hanie Sedghi, Vineet Gupta, Philip M. Long

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We characterize the singular values of the linear transformation associated with a standard 2D multi-channel convolutional layer, enabling their efficient computation. This characterization also leads to an algorithm for projecting a convolutional layer onto an operator-norm ball. We show that this is an effective regularizer; for example, it improves the test error of a deep residual network using batch normalization on CIFAR-10 from 6.2% to 5.3%. Timing tests, reported in Section 4.1, confirm that this characterization speeds up the computation of singular values by multiple orders of magnitude, making it usable in practice." (A numerical sanity check of this characterization appears after the table.)
Researcher Affiliation | Industry | Hanie Sedghi, Vineet Gupta, and Philip M. Long; Google Brain, Mountain View, CA 94043.
Pseudocode | Yes | The introduction gives the NumPy function

    import numpy as np

    def SingularValues(kernel, input_shape):
        # Per-frequency 2D FFTs of each filter slice, padded to the input size.
        transforms = np.fft.fft2(kernel, input_shape, axes=[0, 1])
        return np.linalg.svd(transforms, compute_uv=False)

and Appendix A gives a full Python function Clip_Operator_Norm(...). (A hedged reconstruction of that clipping step appears after the table.)
Open Source Code | No | The paper provides code snippets and mentions in-house TensorFlow and NumPy implementations, but it neither states that the code is released nor links to a code repository.
Open Datasets | Yes | "it improves the test error of a deep residual network using batch normalization on CIFAR-10 from 6.2% to 5.3%."
Dataset Splits | No | The paper uses CIFAR-10 and discusses training parameters and learning-rate schedules, but it does not give explicit train/validation/test splits or a validation-set size/ratio.
Hardware Specification | No | The paper mentions that the TensorFlow implementation runs "much faster on a GPU" and that "clipping norms by our method on a GPU was about 25% faster", but it does not specify which GPU model was used.
Software Dependencies | No | The paper mentions NumPy and TensorFlow implementations but does not give version numbers for either.
Experiment Setup | Yes | "This network reached a test error rate of 6.2% after 250 epochs, using a learning rate schedule determined by a grid search... We then evaluated an algorithm that, every 100 steps, clipped the norms of the convolutional layers to various different values between 0.1 and 3.0. We tried all combinations of the following hyperparameters: (a) the norm of the ball projected onto (no projection, 0.5, 1.0, 1.5, 2.0); (b) the initial learning rate (0.001, 0.003, 0.01, 0.03, 0.1); (c) the minibatch size (32, 64); (d) the number of epochs per decay of the learning rate (1, 2, 3)." (A sketch enumerating this grid appears after the table.)
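
To make the characterization concrete, here is a minimal single-channel sanity check. It is our own construction rather than code from the paper, and it assumes circular (wrap-around) padding, as in the paper's analysis. It materializes the circular-convolution operator explicitly and confirms that its singular values match the output of SingularValues:

    import numpy as np

    def SingularValues(kernel, input_shape):
        transforms = np.fft.fft2(kernel, input_shape, axes=[0, 1])
        return np.linalg.svd(transforms, compute_uv=False)

    n, k = 8, 3                                   # n x n input, k x k kernel
    rng = np.random.default_rng(0)
    kernel = rng.standard_normal((k, k))

    # The paper's method, applied to a 1-input/1-output-channel kernel.
    sv_fft = np.sort(SingularValues(kernel[:, :, None, None], (n, n)).ravel())

    # Direct method: build the circular-convolution matrix one column at a
    # time (column i is the response to the i-th standard-basis image), then
    # take its SVD.
    K = np.fft.fft2(kernel, (n, n))
    A = np.zeros((n * n, n * n))
    for i in range(n * n):
        e = np.zeros((n, n))
        e.flat[i] = 1.0
        A[:, i] = np.fft.ifft2(np.fft.fft2(e) * K).real.ravel()

    sv_direct = np.sort(np.linalg.svd(A, compute_uv=False))
    print(np.allclose(sv_fft, sv_direct))         # True

For multi-channel kernels the same identity holds per frequency: the operator's singular values are the union, over all n^2 frequencies, of the singular values of the small channel-mixing matrices that SingularValues assembles.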
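
Appendix A's Clip_Operator_Norm is referenced in the table but not reproduced. The sketch below is our hedged reconstruction of the projection step from the paper's description (the function name, signature, and cropping detail are ours, not necessarily the paper's exact code): clip the singular values of every per-frequency matrix, transform back, and crop to the original kernel support. The crop is an approximation, since the exact projection can have full n x n spatial support.

    import numpy as np

    def clip_operator_norm(kernel, input_shape, clip_to):
        # kernel: (k, k, c_in, c_out) filter bank; input_shape: (n, n).
        transforms = np.fft.fft2(kernel, input_shape, axes=[0, 1])
        U, D, Vh = np.linalg.svd(transforms, compute_uv=True,
                                 full_matrices=False)
        D = np.minimum(D, clip_to)                # clip at every frequency
        clipped = np.matmul(U * D[..., None, :], Vh)
        spatial = np.fft.ifft2(clipped, axes=[0, 1]).real
        return spatial[:kernel.shape[0], :kernel.shape[1]]  # crop to k x k

A call such as clip_operator_norm(kernel, (32, 32), 1.0) would project a layer toward the operator-norm ball of radius 1.0; per the experiment-setup row, the paper applies its projection every 100 training steps.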
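
Finally, the hyperparameter sweep quoted in the experiment-setup row is a plain Cartesian grid. A minimal enumeration sketch follows; train_and_eval is a hypothetical placeholder, not a function from the paper:

    import itertools

    grid = {
        "clip_to": [None, 0.5, 1.0, 1.5, 2.0],    # None = no projection
        "initial_lr": [0.001, 0.003, 0.01, 0.03, 0.1],
        "batch_size": [32, 64],
        "epochs_per_lr_decay": [1, 2, 3],
    }

    for values in itertools.product(*grid.values()):
        config = dict(zip(grid, values))
        print(config)
        # test_error = train_and_eval(**config)   # hypothetical trainer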