Fantastic Four: Differentiable and Efficient Bounds on Singular Values of Convolution Layers

Authors: Sahil Singla, Soheil Feizi

ICLR 2021

Reproducibility assessment. Each entry lists the reproducibility variable, the assessed result, and the supporting LLM response (quoted from the paper where applicable).
Research Type: Experimental
"Through experiments on MNIST and CIFAR-10, we demonstrate the effectiveness of our spectral bound in improving generalization and provable robustness of deep networks."
Researcher Affiliation: Academia
"Sahil Singla & Soheil Feizi, Department of Computer Science, University of Maryland, College Park, MD 20740, USA. {ssingla,sfeizi}@umd.edu"
Pseudocode: No
The paper does not contain any pseudocode or explicitly labeled algorithm blocks.
Open Source Code: Yes
"Code is available at the github repository: https://github.com/singlasahil14/fantastic-four."
Open Datasets: Yes
"Through experiments on MNIST and CIFAR-10, we demonstrate the effectiveness of our spectral bound in improving generalization and provable robustness of deep networks. ... We use a Resnet-32 neural network architecture and the CIFAR-10 dataset (Krizhevsky, 2009) for training."
Dataset Splits: Yes
"The weight decay coefficient was selected using grid search using 20 values between $[0, 2 \times 10^{-3}]$ using a held-out validation set of 5000 samples."
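The quoted split-and-search protocol is straightforward to reproduce. Below is a minimal sketch in PyTorch, assuming 5,000 of CIFAR-10's 50,000 training images are held out for validation and 20 weight-decay values are spaced evenly over [0, 2e-3]; `build_model`, `train`, and `accuracy` are hypothetical helpers, not taken from the paper's released code.

```python
import numpy as np
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Hold out 5,000 of CIFAR-10's 50,000 training images for validation.
full_train = datasets.CIFAR10("data", train=True, download=True,
                              transform=transforms.ToTensor())
train_set, val_set = random_split(full_train, [45_000, 5_000])

# Grid search over 20 weight-decay values in [0, 2e-3].
best_acc, best_wd = -1.0, None
for wd in np.linspace(0.0, 2e-3, num=20):
    model = build_model()                            # hypothetical helper
    train(model, train_set, weight_decay=float(wd))  # hypothetical helper
    acc = accuracy(model, val_set)                   # hypothetical helper
    if acc > best_acc:
        best_acc, best_wd = acc, float(wd)

print(f"selected weight decay: {best_wd:.2e} (val acc {best_acc:.3f})")
```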
Hardware Specification: Yes
"All experiments were conducted using a single NVIDIA GeForce RTX 2080 Ti GPU."
Software Dependencies: No
The paper mentions 'tensorflow' but does not specify a version number or other software dependencies with versions.
Experiment Setup: Yes
"For regularization, we use the sum of spectral norms of all layers of the network during training. Thus, our regularized objective function is given as follows:

$$\min_{\theta} \; \mathbb{E}_{(x,y)}\big[\ell(f_{\theta}(x), y)\big] + \beta \sum_{I} u^{(I)} \tag{1}$$

where $\beta$ is the regularization coefficient, the $(x, y)$ are the input-label pairs in the training data, and $u^{(I)}$ denotes the bound for the $I$-th convolution or fully connected layer. ... The weight decay coefficient was selected using grid search using 20 values between $[0, 2 \times 10^{-3}]$ using a held-out validation set of 5000 samples. ... We use a 2-layer convolutional neural network with the tanh (Dugas et al., 2000) activation function and 5 filters in the convolution layer."
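To make the quoted objective concrete, here is a minimal PyTorch sketch of Eq. (1) on a 2-layer tanh network with 5 filters like the one described above. The paper's differentiable "Fantastic Four" bound $u^{(I)}$ is not reproduced in this assessment, so the stand-in `u()` below estimates each layer's spectral norm by power iteration on the flattened weight matrix; it only illustrates how the penalty $\beta \sum_I u^{(I)}$ enters the loss, not the paper's actual bound. (The released code reportedly uses TensorFlow; the framework and the value of `beta` here are illustrative.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def u(weight: torch.Tensor, iters: int = 10) -> torch.Tensor:
    """Stand-in for u^(I): power-iteration estimate of the largest singular
    value of the layer's flattened weight matrix (differentiable)."""
    W = weight.reshape(weight.shape[0], -1)
    v = torch.randn(W.shape[1], device=W.device)
    for _ in range(iters):
        v = F.normalize(W.t() @ (W @ v), dim=0)
    return (W @ v).norm()

# 2-layer network from the quote: one conv layer with 5 filters, tanh, then a
# fully connected classifier (CIFAR-10-sized input assumed for illustration).
model = nn.Sequential(
    nn.Conv2d(3, 5, kernel_size=3, padding=1),
    nn.Tanh(),
    nn.Flatten(),
    nn.Linear(5 * 32 * 32, 10),
)

beta = 0.01  # regularization coefficient (illustrative, not from the paper)
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))

task_loss = F.cross_entropy(model(x), y)
# Eq. (1): task loss plus beta times the sum of per-layer bounds u^(I).
reg = sum(u(p) for name, p in model.named_parameters()
          if name.endswith("weight") and p.dim() > 1)
loss = task_loss + beta * reg
loss.backward()
```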