Theoretical Understanding of Learning from Adversarial Perturbations
Authors: Soichiro Kumano, Hiroshi Kera, Toshihiko Yamasaki
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically verify our theoretical results. Detailed experimental settings and additional results for other norms (L0 and L∞) and Gaussian noises can be found in Appendix I. In this section, we validate Theorem 4.4 and Corollary 4.5 using one-hidden-layer neural networks on artificial datasets. The standard dataset D := {(x_n, y_n)}_{n=1}^N consists of x_n and y_n sampled from U([−1, 1]^d) and U({±1}), respectively. Our theorems only require the orthogonality of training samples; thus, using uniform noises as training samples poses no problem. |
| Researcher Affiliation | Academia | Soichiro Kumano, The University of Tokyo, kumano@cvm.t.u-tokyo.ac.jp; Hiroshi Kera, Chiba University, kera@chiba-u.jp; Toshihiko Yamasaki, The University of Tokyo, yamasaki@cvm.t.u-tokyo.ac.jp |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/s-kumano/learning-from-adversarial-perturbations. |
| Open Datasets | Yes | Table 1 shows accuracy in each scenario for MNIST (Deng, 2012), Fashion-MNIST (Xiao et al., 2017), and CIFAR-10 (Krizhevsky, 2009). |
| Dataset Splits | No | The paper mentions 'max. validation accuracy' in Figure A19, but it does not specify validation dataset splits (e.g., percentages or counts) in the main text or experimental setup sections. It only states the datasets used (MNIST, Fashion-MNIST, CIFAR-10) and later shows validation accuracy plots. |
| Hardware Specification | Yes | We used an NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions training components such as 'stochastic gradient descent', 'Nesterov momentum', and 'cross-entropy loss', but it does not name specific software libraries or frameworks, nor provide version numbers for any. |
| Experiment Setup | Yes | We used one-hidden-layer neural networks and stochastic gradient descent with a learning rate of 0.01, momentum of 0.9, and exponential loss. Considering t → ∞, we set the epochs to 100,000. We used a six-layer convolutional neural network for MNIST and Fashion-MNIST and WideResNet (Zagoruyko & Komodakis, 2016) for CIFAR-10. The batch size was set to 128. While no data augmentation was applied to MNIST and Fashion-MNIST, random cropping and horizontal flipping were applied to CIFAR-10. We used stochastic gradient descent with Nesterov momentum of 0.9, weight decay of 5 × 10⁻⁴, and cross-entropy loss. The initial learning rates can be found in Table A3. The perturbation constraint ϵ or number of modifiable pixels d_δ was set according to Table A4. We set the epochs to 100 for MNIST and 200 for Fashion-MNIST and CIFAR-10. |
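
The Research Type row quotes the paper's synthetic-data experiment: a one-hidden-layer network trained on samples drawn from U([−1, 1]^d) with labels from U({±1}), using SGD (learning rate 0.01, momentum 0.9) and the exponential loss over a very large number of epochs. The following is a minimal sketch of that setup, assuming PyTorch; the input dimension, sample count, and hidden width are illustrative placeholders rather than values from the paper, and the full-batch loop simplifies the stochastic gradient descent the authors describe.

```python
# Hedged sketch (not the authors' code) of the artificial-dataset experiment.
import torch
import torch.nn as nn

d, N, h = 1000, 100, 512                        # input dim, samples, hidden width (placeholders)
X = 2 * torch.rand(N, d) - 1                    # x_n ~ U([-1, 1]^d)
y = 2 * torch.randint(0, 2, (N,)).float() - 1   # y_n ~ U({-1, +1})

model = nn.Sequential(nn.Linear(d, h), nn.ReLU(), nn.Linear(h, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

for epoch in range(100_000):                    # large epoch count approximates t -> infinity
    opt.zero_grad()
    margin = y * model(X).squeeze(1)            # y_n * f(x_n)
    loss = torch.exp(-margin).mean()            # exponential loss
    loss.backward()
    opt.step()
```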
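
The Experiment Setup row also describes the real-image training recipe: SGD with Nesterov momentum 0.9, weight decay 5 × 10⁻⁴, cross-entropy loss, batch size 128, and random cropping plus horizontal flipping for CIFAR-10. The sketch below reconstructs that recipe under stated assumptions: PyTorch/torchvision, a ResNet-18 stand-in rather than the paper's six-layer CNN or WideResNet, and a placeholder initial learning rate of 0.1 (the paper defers the actual values to its Table A3).

```python
# Hedged sketch (not the authors' code) of the CIFAR-10 training configuration.
import torch
import torchvision
import torchvision.transforms as T

transform = T.Compose([
    T.RandomCrop(32, padding=4),    # random cropping (CIFAR-10 only)
    T.RandomHorizontalFlip(),       # horizontal flipping (CIFAR-10 only)
    T.ToTensor(),
])
train_set = torchvision.datasets.CIFAR10("data", train=True, download=True,
                                         transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

model = torchvision.models.resnet18(num_classes=10)   # stand-in; the paper uses WideResNet
opt = torch.optim.SGD(model.parameters(), lr=0.1,      # lr 0.1 is a placeholder (see Table A3)
                      momentum=0.9, nesterov=True, weight_decay=5e-4)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(200):            # 200 epochs for CIFAR-10 per the quoted setup
    for images, labels in loader:
        opt.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        opt.step()
```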