Fantastic Four: Differentiable and Efficient Bounds on Singular Values of Convolution Layers
Authors: Sahil Singla, Soheil Feizi
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments on MNIST and CIFAR-10, we demonstrate the effectiveness of our spectral bound in improving generalization and provable robustness of deep networks. |
| Researcher Affiliation | Academia | Sahil Singla & Soheil Feizi Department of Computer Science University of Maryland College Park, MD 20740, USA {ssingla,sfeizi}@umd.edu |
| Pseudocode | No | The paper does not contain any pseudocode or explicitly labeled algorithm blocks. |
| Open Source Code | Yes | Code is available at the github repository: https://github.com/singlasahil14/fantastic-four. |
| Open Datasets | Yes | Through experiments on MNIST and CIFAR-10, we demonstrate the effectiveness of our spectral bound in improving generalization and provable robustness of deep networks. ... We use a Resnet-32 neural network architecture and the CIFAR-10 dataset (Krizhevsky, 2009) for training. |
| Dataset Splits | Yes | The weight decay coefficient was selected using grid search using 20 values between [0, 2×10⁻³] using a held-out validation set of 5000 samples. |
| Hardware Specification | Yes | All experiments were conducted using a single NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions 'tensorflow' but does not specify a version number or other software dependencies with versions. |
| Experiment Setup | Yes | For regularization, we use the sum of spectral norms of all layers of the network during training. Thus, our regularized objective function is given as follows: min_θ 𝔼_(x,y)[ℓ(f_θ(x), y)] + β Σ_I u^(I)  (1), where β is the regularization coefficient, the (x, y)'s are the input-label pairs in the training data, and u^(I) denotes the bound for the I-th convolution or fully connected layer (a minimal sketch of this objective appears below the table). ... The weight decay coefficient was selected using grid search using 20 values between [0, 2×10⁻³] using a held-out validation set of 5000 samples. ... We use a 2-layer convolutional neural network with the tanh (Dugas et al., 2000) activation function and 5 filters in the convolution layer. |
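To make the quoted objective concrete, here is a minimal sketch of how per-layer bounds could be folded into a training loss. It is written in PyTorch purely for illustration (the paper's own code is TensorFlow-based), and `spectral_bound` is a hypothetical stand-in that estimates a single spectral norm by power iteration; the paper's actual u^(I) is a differentiable bound on the convolution layer's singular values, not this simple estimate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def spectral_bound(weight, n_iters=10):
    """Hypothetical stand-in for the paper's per-layer bound u^(I).

    For illustration only: estimates the spectral norm of the weight
    reshaped to a 2-D matrix via power iteration. The paper's actual
    bound for convolution layers is different; see the repository.
    """
    W = weight.reshape(weight.shape[0], -1)       # (out, in * k * k)
    v = torch.randn(W.shape[1], device=W.device)
    for _ in range(n_iters):                      # power iteration
        u = F.normalize(W @ v, dim=0)
        v = F.normalize(W.t() @ u, dim=0)
    return u @ (W @ v)                            # ≈ largest singular value

def regularized_loss(model, x, y, beta=1e-2):
    """Cross-entropy plus beta times the sum of per-layer bounds (Eq. 1)."""
    loss = F.cross_entropy(model(x), y)
    reg = sum(spectral_bound(m.weight)
              for m in model.modules()
              if isinstance(m, (nn.Conv2d, nn.Linear)))
    return loss + beta * reg
```

Here `beta` plays the role of the regularization coefficient β in Equation (1); as the table notes, the related weight-decay coefficient was tuned by grid search on a held-out validation set of 5000 samples.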