Improving Gradient Regularization using Complex-Valued Neural Networks

Authors: Eric C. Yeats, Yiran Chen, Hai Li

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | All experiments are conducted using the PyTorch library (Paszke et al., 2019). We evaluate the training characteristics, white-box adversarial robustness, and black-box robustness of gradient-regularized CVNNs. We compare these results with those of real-valued NNs trained with gradient regularization or adversarial training. We consider the additional parameter and MAC requirement of complex numbers over real numbers. (A gradient-regularization training sketch follows the table.)
Researcher Affiliation | Academia | Eric Yeats, Yiran Chen, Hai Li (ECE Dept., Duke University, Durham, North Carolina, USA). Correspondence to: Eric Yeats <eric.yeats@duke.edu>.
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Code is available at: https://github.com/ericyeats/cvnn-security.
Open Datasets | Yes | White-box attacks are crafted against the networks on four popular image classification benchmark tasks: MNIST (LeCun, 1998), Fashion MNIST (Xiao et al., 2017), SVHN (Netzer et al., 2011), and CIFAR-10 (Krizhevsky et al., 2009). (A torchvision loading sketch follows the table.)
Dataset Splits | No | The MNIST, Fashion MNIST, and CIFAR-10 benchmarks consist of 50,000 training images and 10,000 test images. The SVHN benchmark consists of 73,257 training images and 26,032 test images. No explicit validation split percentage, size, or methodology is given, although a "validation loss" is mentioned in Figure 4. (An assumed validation-split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud computing instances used for running experiments.
Software Dependencies | No | The paper mentions using the "PyTorch library (Paszke et al., 2019)" but does not specify a version number for PyTorch or any other software dependency, which is required for reproducibility.
Experiment Setup | Yes | The standard objective for all networks is cross-entropy loss, optimized using SGD with Nesterov momentum of µ = 0.875 and weight decay of ℓ2 = 10^-4. Networks for MNIST, SVHN, and Fashion MNIST are trained for 30 epochs with a minibatch size of 64 and an initial learning rate of ν = 0.005; networks for CIFAR-10 are trained for 80 epochs with a minibatch size of 128 and an initial learning rate of ν = 0.01. The learning rate is decayed by γ = 0.2 after 20 epochs for the MNIST, SVHN, and Fashion MNIST networks, and at epochs 40, 60, and 72 for the CIFAR-10 networks. (A PyTorch optimizer/scheduler sketch follows the table.)
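
For the gradient-regularized training named in the Research Type row, below is a minimal PyTorch sketch of an input-gradient (double-backprop) penalty added to cross-entropy. The penalty weight `lam` and the squared-norm reduction are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def gradient_regularized_loss(model, x, y, lam=0.1):
    """Cross-entropy plus an input-gradient penalty (double backprop).

    `lam` is a hypothetical penalty weight chosen for illustration.
    """
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    # Gradient of the loss w.r.t. the input; create_graph=True makes the
    # penalty itself differentiable w.r.t. the model parameters.
    grad_x, = torch.autograd.grad(ce, x, create_graph=True)
    penalty = grad_x.pow(2).sum(dim=tuple(range(1, grad_x.dim()))).mean()
    return ce + lam * penalty
```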
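All four benchmarks listed in the Open Datasets row ship with torchvision; a minimal loading sketch follows. The data root and the bare ToTensor transform are assumptions, not details reported in the paper.

```python
from torchvision import datasets, transforms

tfm = transforms.ToTensor()
mnist   = datasets.MNIST("data", train=True, download=True, transform=tfm)
fashion = datasets.FashionMNIST("data", train=True, download=True, transform=tfm)
svhn    = datasets.SVHN("data", split="train", download=True, transform=tfm)
cifar10 = datasets.CIFAR10("data", train=True, download=True, transform=tfm)
```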
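Because the Dataset Splits row notes that no validation methodology is specified, the sketch below shows one assumed way to carve a validation set out of the 50,000 CIFAR-10 training images with torch.utils.data.random_split. The 45,000/5,000 split and the fixed seed are hypothetical choices, not from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.CIFAR10("data", train=True, download=True,
                              transform=transforms.ToTensor())
generator = torch.Generator().manual_seed(0)  # fixed seed for a reproducible split
train_set, val_set = random_split(full_train, [45_000, 5_000], generator=generator)
```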
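The hyperparameters quoted in the Experiment Setup row map directly onto PyTorch's SGD optimizer and MultiStepLR scheduler. The sketch below uses the reported CIFAR-10 values; the linear model is a placeholder standing in for the paper's real- and complex-valued networks, and the training pass itself is elided.

```python
import torch

# Placeholder model; the paper trains real- and complex-valued networks instead.
model = torch.nn.Linear(3 * 32 * 32, 10)

# Reported CIFAR-10 settings: SGD with Nesterov momentum 0.875, weight decay 1e-4,
# initial learning rate 0.01, decayed by a factor of 0.2 at epochs 40, 60, and 72.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.875,
                            nesterov=True, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[40, 60, 72], gamma=0.2)

for epoch in range(80):
    # ... one pass over 128-image minibatches, minimizing cross-entropy ...
    scheduler.step()
```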