Improving Gradient Regularization using Complex-Valued Neural Networks
Authors: Eric C Yeats, Yiran Chen, Hai Li
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | All experiments are conducted using the PyTorch library (Paszke et al., 2019). We evaluate the training characteristics, white-box adversarial robustness, and black-box robustness of gradient-regularized CVNNs. We compare these results with those of real-valued NNs trained with gradient regularization or adversarial training. We consider the additional parameter and MAC requirements of complex numbers over real numbers. (A hedged sketch of the gradient-regularization penalty follows the table.) |
| Researcher Affiliation | Academia | Eric Yeats¹, Yiran Chen¹, Hai Li¹. ¹ECE Dept., Duke University, Durham, North Carolina, USA. Correspondence to: Eric Yeats <eric.yeats@duke.edu>. |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Code is available at: https://github.com/ericyeats/cvnn-security. |
| Open Datasets | Yes | White-box attacks are crafted against the networks on four popular image classification benchmark tasks: MNIST (LeCun, 1998), Fashion MNIST (Xiao et al., 2017), SVHN (Netzer et al., 2011), and CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | No | The MNIST, Fashion MNIST, and CIFAR-10 benchmarks consist of 50,000 training images and 10,000 test images. The SVHN benchmark consists of 73,257 training images and 26,032 test images. No explicit mention of a separate validation dataset split (percentage, size, or methodology) was provided, although "validation loss" is mentioned in Figure 4. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud computing instances used for running experiments. |
| Software Dependencies | No | The paper mentions using the "PyTorch library (Paszke et al., 2019)" but does not specify a version number for PyTorch or any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | The standard objective for all networks is cross-entropy loss, optimized using SGD with Nesterov momentum of µ = 0.875 and weight decay of ℓ2 = 10^-4. Networks for MNIST, SVHN, and Fashion MNIST are trained for 30 epochs with a minibatch size of 64 and an initial learning rate of ν = 0.005, and networks for CIFAR-10 are trained for 80 epochs with a minibatch size of 128 and an initial learning rate of ν = 0.01. The learning rate is decayed by γ = 0.2 after epoch 20 for MNIST, SVHN, and Fashion MNIST networks, and after epochs 40, 60, and 72 for CIFAR-10 networks. |
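
For context on the method being reproduced, gradient regularization here refers to penalizing the norm of the loss gradient with respect to the network input (double backpropagation). Below is a minimal PyTorch sketch of such a penalty, not the authors' implementation (which is available in the repository linked above); the function name `gradient_regularized_loss` and the penalty weight `lam` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def gradient_regularized_loss(model, x, y, lam=0.1):
    """Cross-entropy loss plus a penalty on the input-gradient norm.

    `lam` is an assumed penalty weight, not a value taken from the paper.
    """
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    # Differentiate the loss w.r.t. the input; create_graph=True keeps the
    # graph so the penalty itself can be backpropagated (double backprop).
    (grad_x,) = torch.autograd.grad(ce, x, create_graph=True)
    penalty = grad_x.pow(2).flatten(start_dim=1).sum(dim=1).mean()
    return ce + lam * penalty
```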
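
The Experiment Setup row maps onto a standard PyTorch optimizer and learning-rate schedule. The following is a hedged sketch of that configuration for the MNIST/SVHN/Fashion MNIST setting, with the CIFAR-10 values noted in comments; the model and the synthetic data loader are hypothetical placeholders, not the paper's architectures or datasets.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins for one of the paper's networks and its data loader;
# the real experiments use CVNN/NN architectures and MNIST/SVHN/Fashion MNIST data.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(
        torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,))
    ),
    batch_size=64,  # 128 for CIFAR-10
)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.005,           # 0.01 for CIFAR-10
    momentum=0.875,     # Nesterov momentum mu = 0.875
    nesterov=True,
    weight_decay=1e-4,  # l2 = 10^-4
)

# Decay the learning rate by gamma = 0.2 after epoch 20 (MNIST, SVHN,
# Fashion MNIST); CIFAR-10 instead uses milestones [40, 60, 72].
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[20], gamma=0.2)

for epoch in range(30):  # 80 epochs for CIFAR-10
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)  # plus the gradient penalty in regularized runs
        loss.backward()
        optimizer.step()
    scheduler.step()
```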