Theoretical Understanding of Learning from Adversarial Perturbations

Authors: Soichiro Kumano, Hiroshi Kera, Toshihiko Yamasaki

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we empirically verify our theoretical results. Detailed experimental settings and additional results for other norms (L0 and L∞) and Gaussian noises can be found in Appendix I. In this section, we validate Theorem 4.4 and Corollary 4.5 using one-hidden-layer neural networks on artificial datasets. The standard dataset D := {(x_n, y_n)}_{n=1}^N consists of x_n and y_n sampled from U([-1, 1]^d) and U({±1}), respectively. Our theorems only require the orthogonality of training samples; thus, using uniform noises as training samples poses no problem. (See the dataset-sampling sketch after this table.)
Researcher Affiliation | Academia | Soichiro Kumano, The University of Tokyo, kumano@cvm.t.u-tokyo.ac.jp; Hiroshi Kera, Chiba University, kera@chiba-u.jp; Toshihiko Yamasaki, The University of Tokyo, yamasaki@cvm.t.u-tokyo.ac.jp
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/s-kumano/learning-from-adversarial-perturbations.
Open Datasets | Yes | Table 1 shows accuracy in each scenario for MNIST (Deng, 2012), Fashion-MNIST (Xiao et al., 2017), and CIFAR-10 (Krizhevsky, 2009).
Dataset Splits | No | The paper mentions 'max. validation accuracy' in Figure A19 but does not specify validation splits (e.g., percentages or counts) in the main text or experimental setup; it only names the datasets used (MNIST, Fashion-MNIST, CIFAR-10) and later shows validation-accuracy plots.
Hardware Specification | Yes | We used an NVIDIA A100 GPU.
Software Dependencies | No | The paper mentions training components such as stochastic gradient descent, Nesterov momentum, and cross-entropy loss, but it does not name any software libraries or frameworks, let alone their version numbers.
Experiment Setup | Yes | We used one-hidden-layer neural networks and stochastic gradient descent with a learning rate of 0.01, momentum of 0.9, and exponential loss. Considering t → ∞, we set the epochs to 100,000. We used a six-layer convolutional neural network for MNIST and Fashion-MNIST and WideResNet (Zagoruyko & Komodakis, 2016) for CIFAR-10. The batch size was set to 128. While no data augmentation was applied to MNIST and Fashion-MNIST, random cropping and horizontal flipping were applied to CIFAR-10. We used stochastic gradient descent with Nesterov momentum of 0.9, weight decay of 5 × 10⁻⁴, and cross-entropy loss. The initial learning rates can be found in Table A3. The perturbation constraint ϵ or number of modifiable pixels d_δ was set according to Table A4. We set the epochs to 100 for MNIST and 200 for Fashion-MNIST and CIFAR-10. (See the training-configuration sketch after this table.)
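
To make the quoted synthetic setup (Research Type row) concrete, the following is a minimal sketch of sampling the standard dataset D and defining a one-hidden-layer network. It assumes PyTorch; the dimension d, sample count N, hidden width, and ReLU activation are illustrative assumptions, not values taken from the excerpt.

```python
import torch

# Placeholder sizes; the paper's exact values are reported in its appendix.
d, N = 1000, 64

# Standard dataset D = {(x_n, y_n)}_{n=1}^N:
# inputs x_n ~ U([-1, 1]^d), labels y_n ~ U({-1, +1}).
X = 2 * torch.rand(N, d) - 1
y = 2 * torch.randint(0, 2, (N,)).float() - 1


class OneHiddenLayerNet(torch.nn.Module):
    """One-hidden-layer network; the ReLU activation and width are assumptions."""

    def __init__(self, d: int, width: int = 512) -> None:
        super().__init__()
        self.hidden = torch.nn.Linear(d, width)
        self.out = torch.nn.Linear(width, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scalar output f(x); its sign is compared with the ±1 label.
        return self.out(torch.relu(self.hidden(x))).squeeze(-1)
```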
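
The Experiment Setup row quotes two training configurations: the synthetic one-hidden-layer experiments and the image-classification experiments. Below is a minimal sketch of both, again assuming PyTorch; the helper names are illustrative rather than the paper's, and the initial learning rate for the image experiments is left as a parameter because it is dataset-dependent (Table A3).

```python
import torch


def exponential_loss(logits: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Exponential loss exp(-y * f(x)) for labels y in {-1, +1}."""
    return torch.exp(-y * logits).mean()


def synthetic_optimizer(model: torch.nn.Module) -> torch.optim.Optimizer:
    """One-hidden-layer experiments: SGD with lr 0.01 and momentum 0.9,
    run for 100,000 epochs to approximate the t -> infinity regime."""
    return torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)


def image_optimizer(model: torch.nn.Module, lr0: float) -> torch.optim.Optimizer:
    """MNIST / Fashion-MNIST / CIFAR-10 experiments: SGD with Nesterov momentum 0.9
    and weight decay 5e-4, paired with cross-entropy loss and batch size 128.
    lr0 is the dataset-dependent initial learning rate (Table A3 of the paper)."""
    return torch.optim.SGD(model.parameters(), lr=lr0, momentum=0.9,
                           nesterov=True, weight_decay=5e-4)
```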