On Lipschitz Regularization of Convolutional Layers using Toeplitz Matrix Theory

Authors: Alexandre Araujo, Benjamin Negrevergne, Yann Chevaleyre, Jamal Atif (pp. 6661-6669)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We theoretically study the approximation of this algorithm and show experimentally that it is more efficient and accurate than competing approaches. Finally, we illustrate our approach on adversarial robustness." and Table 2: "This table shows the Accuracy under ℓ2 and ℓ∞ attacks of CIFAR10/100 datasets."
Researcher Affiliation | Academia | Alexandre Araujo, Benjamin Negrevergne, Yann Chevaleyre, Jamal Atif; PSL, Université Paris-Dauphine, CNRS, LAMSADE, MILES Team, Paris, France; alexandre.araujo@dauphine.eu
Pseudocode | Yes | Algorithm 1 (PolyGrid): input: polynomial f, number of samples S; output: approximated maximum modulus of f; σ ← 0, ω1 ← 0, ϵ ← 2π/S; for i = 0 to S−1: ω1 ← ω1 + ϵ, ω2 ← 0; for j = 0 to S−1: ω2 ← ω2 + ϵ, σ ← max(σ, f(ω1, ω2)); return σ. (A Python sketch of this grid search follows the table.)
Open Source Code | No | The paper mentions 'supplementary material' but does not explicitly state that the source code for the methodology is openly provided, nor does it give a link.
Open Datasets | Yes | CIFAR10/100 Dataset: "For all our experiments, we use the Wide ResNet architecture introduced by Zagoruyko and Komodakis (2016) to train our classifiers." and Experimental Settings for ImageNet Dataset: "For all our experiments, we use the ResNet-101 architecture (He et al. 2016)."
Dataset Splits | No | The paper mentions training parameters and evaluation on a test set but does not explicitly detail the training/test/validation dataset splits, nor does it mention a dedicated validation set.
Hardware Specification | Yes | The comparison has been made on a Tesla V100 GPU.
Software Dependencies | No | The paper mentions the 'PyTorch CUDA profiler', which implies PyTorch, but it does not specify version numbers for any software dependencies.
Experiment Setup | Yes | We use Wide ResNet networks with 28 layers and a width factor of 10. We train our networks for 200 epochs with a batch size of 200. We use Stochastic Gradient Descent with a momentum of 0.9 and an initial learning rate of 0.1 with exponential decay of 0.1 (MultiStepLR, gamma = 0.1) after epochs 60, 120 and 160. For Adversarial Training (Madry et al. 2018), we use Projected Gradient Descent with ϵ = 8/255 (≈ 0.031), a step size of ϵ/5 (≈ 0.0062) and 10 iterations; we use a random initialization but run the attack only once. (See the PyTorch sketch below.)
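
The PolyGrid pseudocode quoted above is a plain grid search over [0, 2π]². Below is a minimal Python sketch under stated assumptions: the modulus of f is taken at each grid point (implied by the stated output, "approximated maximum modulus of f"), and the example polynomial built from a random 3×3 kernel is purely illustrative, not taken from the paper.

import numpy as np

def poly_grid(f, S):
    # Grid-search approximation of max |f(w1, w2)| over [0, 2*pi]^2,
    # following the quoted PolyGrid pseudocode (Algorithm 1).
    sigma, eps = 0.0, 2 * np.pi / S
    w1 = 0.0
    for _ in range(S):
        w1 += eps
        w2 = 0.0
        for _ in range(S):
            w2 += eps
            sigma = max(sigma, abs(f(w1, w2)))  # modulus assumed, per the stated output
    return sigma

# Illustrative use: f as the 2-D trigonometric polynomial of a random 3x3 kernel
# (a stand-in for a convolutional filter; not the paper's experiment).
K = np.random.randn(3, 3)
f = lambda w1, w2: sum(K[i, j] * np.exp(1j * (i * w1 + j * w2))
                       for i in range(3) for j in range(3))
print(poly_grid(f, S=100))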
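
For the Experiment Setup row, the following is a short PyTorch sketch of the reported optimizer, learning-rate schedule and PGD attack settings. The placeholder linear model, the pgd_attack helper and all variable names are assumptions for illustration, not the authors' code.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder standing in for the Wide ResNet 28-10 classifier (assumption).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# SGD, momentum 0.9, lr 0.1, decayed by 0.1 at epochs 60, 120 and 160.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.1)

def pgd_attack(model, x, y, eps=8 / 255, step=(8 / 255) / 5, iters=10):
    # PGD with random start, run once, as in the quoted setup.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(iters):
        F.cross_entropy(model(x + delta), y).backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (x + delta).clamp(0, 1).detach()

In the full training loop, scheduler.step() would be called once per epoch over the 200 reported epochs, with batches of size 200.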