Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks

Authors: Qiyang Li, Saminul Haque, Cem Anil, James Lucas, Roger B. Grosse, Jörn-Henrik Jacobsen

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we find that it is competitive with existing approaches to provable adversarial robustness and Wasserstein distance estimation. We evaluate our GNP networks in two situations where expressive Lipschitz-constrained networks are of central importance. The first is provable norm-bounded adversarial robustness... The other application is Wasserstein distance estimation. (Section 5, Experiments)
Researcher Affiliation | Academia | Qiyang Li, Saminul Haque, Cem Anil, James Lucas, Roger Grosse, Jörn-Henrik Jacobsen (University of Toronto, Vector Institute); {qiyang.li, saminul.haque, cem.anil}@mail.utoronto.ca; {jlucas, rgrosse}@cs.toronto.edu; j.jacobsen@vectorinstitute.ai
Pseudocode | Yes | Algorithm 1: Block Convolution Orthogonal Parameterization (BCOP) (a hedged sketch of this parameterization appears after the table)
Open Source Code | Yes | Code is available at: github.com/ColinQiyangLi/LConvNet
Open Datasets | Yes | The first task is provably robust image classification on two datasets (MNIST [31] and CIFAR-10 [30]). The second task is 1-Wasserstein distance estimation... trained on RGB images from the STL-10 dataset [13] and CIFAR-10 dataset [30]
Dataset Splits | No | No specific training/validation/test split percentages or sample counts are provided, and no dedicated validation set is mentioned. The paper relies on standard datasets (MNIST, CIFAR-10) with predefined splits, but these are not explicitly detailed for reproduction.
Hardware Specification | No | No specific hardware details (GPU models, CPU models, or memory specifications) used for running experiments are provided in the paper.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., library versions, framework versions) are provided in the paper.
Experiment Setup | Yes | Unless specified otherwise, each experiment is repeated 5 times with mean and standard deviation reported. For OSSN, we use 10 power iterations... For SVCM, we perform the singular value clipping projection with 50 iterations after every 100 gradient updates... To achieve large output margins, we use first-order, multi-class, hinge loss with a margin of 2.12 on MNIST and 0.7071 on CIFAR-10. (a hedged sketch of these ingredients also appears below)
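
The Pseudocode row above names Algorithm 1 (BCOP) without reproducing it. As a point of reference, here is a minimal PyTorch sketch of the parameterization the paper describes: a k x k orthogonal convolution built from a 1x1 orthogonal convolution composed with (k-1) horizontal and (k-1) vertical block convolutions whose kernels are [P, I-P] for symmetric projectors P = UU^T. The Björck iteration count, the projector rank (channels/2), the unpadded ("valid") boundary handling, and the class name BCOPConv are illustrative assumptions, not details taken from the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def bjorck(w, iters=25):
        # Björck orthonormalization: pulls w toward the nearest matrix with
        # orthonormal columns (converges when the spectral norm of w is near 1).
        for _ in range(iters):
            w = 1.5 * w - 0.5 * w @ w.t() @ w
        return w

    class BCOPConv(nn.Module):
        # Sketch of a BCOP-style orthogonal k x k convolution: a 1x1
        # orthogonal convolution composed with 2(k-1) block convolutions
        # whose kernels are [P, I - P] for symmetric projectors P = U U^T.
        def __init__(self, channels, kernel_size):
            super().__init__()
            self.c = channels
            self.q = nn.Parameter(torch.empty(channels, channels))
            nn.init.orthogonal_(self.q)
            self.us = nn.ParameterList(
                nn.Parameter(nn.init.orthogonal_(torch.empty(channels, channels // 2)))
                for _ in range(2 * (kernel_size - 1))
            )

        def forward(self, x):
            c = self.c
            q = bjorck(self.q)
            x = F.conv2d(x, q.view(c, c, 1, 1))            # 1x1 orthogonal conv
            eye = torch.eye(c, device=x.device, dtype=x.dtype)
            for i, u in enumerate(self.us):
                u = bjorck(u)                              # orthonormal columns
                p = u @ u.t()                              # symmetric projector P
                taps = torch.stack([p, eye - p], dim=-1)   # shape (c, c, 2)
                # Alternate 1x2 (horizontal) and 2x1 (vertical) block convs;
                # no padding, so spatial size shrinks as for a 'valid' k x k conv.
                w = taps.unsqueeze(2) if i % 2 == 0 else taps.unsqueeze(3)
                x = F.conv2d(x, w)
            return x

    # Usage: 1-Lipschitz by construction (orthogonal up to boundary effects).
    conv = BCOPConv(channels=16, kernel_size=3)
    y = conv(torch.randn(2, 16, 8, 8))   # output spatial size: 6 x 6

The paper's Algorithm 1 convolves these small kernels into a single k x k kernel before applying it; composing them sequentially, as here, computes the same linear map but keeps the sketch short. With unpadded convolutions the boundary makes the map norm-nonincreasing rather than exactly orthogonal, so treat this strictly as a sketch.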
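Similarly, the Experiment Setup row quotes three ingredients: power iteration for OSSN, singular value clipping for SVCM, and a first-order multi-class hinge loss with fixed margins. The sketch below illustrates the first and third under stated assumptions: the input shape, padding, and helper name conv_spectral_norm are hypothetical, and PyTorch's F.multi_margin_loss with p=1 is a standard first-order multi-class hinge rather than code from the paper. The SVCM clipping projection is more involved and is not sketched here.

    import torch
    import torch.nn.functional as F

    def conv_spectral_norm(weight, in_shape, n_iters=10, padding=1):
        # Power iteration for the largest singular value of a stride-1
        # convolution; conv_transpose2d acts as the adjoint of conv2d here.
        u = torch.randn(1, *in_shape)
        v = None
        for _ in range(n_iters):
            v = F.conv2d(u, weight, padding=padding)
            v = v / (v.norm() + 1e-12)
            u = F.conv_transpose2d(v, weight, padding=padding)
            u = u / (u.norm() + 1e-12)
        # Rayleigh quotient <v, W u> approximates sigma_max.
        return torch.sum(v * F.conv2d(u, weight, padding=padding))

    # OSSN-style estimate with 10 power iterations, as in the quoted setup.
    w = torch.randn(16, 16, 3, 3)
    sigma = conv_spectral_norm(w, in_shape=(16, 32, 32), n_iters=10)

    # First-order (p=1) multi-class hinge loss with the quoted margins
    # (2.12 on MNIST, 0.7071 on CIFAR-10).
    logits = torch.randn(8, 10)               # dummy class scores
    targets = torch.randint(0, 10, (8,))
    loss = F.multi_margin_loss(logits, targets, p=1, margin=2.12)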