Do Perceptually Aligned Gradients Imply Robustness?
Authors: Roy Ganz, Bahjat Kawar, Michael Elad
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on multiple datasets and architectures validate that models with aligned gradients exhibit significant robustness, exposing the surprising bidirectional connection between PAG and robustness. Lastly, we show that better gradient alignment leads to increased robustness and harness this observation to boost the robustness of existing adversarial training techniques. |
| Researcher Affiliation | Academia | ¹Electrical Engineering Department, Technion, Haifa, Israel; ²Computer Science Department, Technion, Haifa, Israel. |
| Pseudocode | No | The paper describes its methods using prose and mathematical equations (e.g., Equation (4), Equation (5), Equation (1)) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/royg27/PAG-ROB. |
| Open Datasets | Yes | We experiment with models from different architecture families (Convolutional Neural Networks and Vision Transformers (ViT) (Dosovitskiy et al., 2021)) and multiple datasets: CIFAR-10, STL, and CIFAR-100. |
| Dataset Splits | Yes | To evaluate performance, we generate a balanced test set from the same distribution consisting of 600 samples. ... STL... has 5,000 training and 8,000 test images. |
| Hardware Specification | Yes | We use a single Tesla V100 GPU. ... We use two NVIDIA RTX A4000 16GB GPUs for each experiment. |
| Software Dependencies | No | The paper mentions several GitHub repositories for implementations and tools used (e.g., improved-diffusion, MLP-Mixer-CIFAR, pytorch-vgg-cifar10, auto-attack, TRADES) but does not provide specific version numbers for software dependencies like PyTorch, Python, or CUDA. |
| Experiment Setup | Yes | We do so for 100 epochs with a batch size of 128, using Adam optimizer, a learning rate of 0.01, and the same seed for both training processes. ... For all the tested datasets, we train the classifier (ResNet-18 or ViT) for 100 epochs, using SGD with a learning rate of 0.01, a momentum of 0.9, and a weight decay of 0.0001. In addition, we use the standard augmentations for these datasets: random cropping with padding of 4 and random horizontal flipping with a probability of 0.5. We use a batch size of 64 for CIFAR-10 and CIFAR-100 and 32 for STL. |
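
The quoted experiment setup maps onto a standard PyTorch training loop. Below is a minimal sketch, not the authors' code, that wires up only the hyperparameters quoted for CIFAR-10 (ResNet-18, SGD with learning rate 0.01, momentum 0.9, weight decay 0.0001, 100 epochs, batch size 64, random crop with padding 4, horizontal flip with probability 0.5). The paper's gradient-alignment objective is omitted, and torchvision's ImageNet-style ResNet-18 is an assumption; the paper may use a CIFAR-adapted variant.

```python
# Minimal sketch of the quoted CIFAR-10 classifier training setup (assumptions:
# torchvision ResNet-18 and CIFAR-10 loaders; no PAG-alignment loss included).
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

def main():
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Standard augmentations quoted in the paper: random crop (padding 4),
    # horizontal flip with probability 0.5.
    train_tf = T.Compose([
        T.RandomCrop(32, padding=4),
        T.RandomHorizontalFlip(p=0.5),
        T.ToTensor(),
    ])
    train_set = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True, transform=train_tf)
    train_loader = torch.utils.data.DataLoader(
        train_set, batch_size=64, shuffle=True, num_workers=2)

    # ResNet-18 with a 10-way head (approximation of the paper's architecture).
    model = torchvision.models.resnet18(num_classes=10).to(device)

    # SGD with the quoted learning rate, momentum, and weight decay.
    optimizer = torch.optim.SGD(
        model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)
    criterion = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(100):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch + 1}: last batch loss {loss.item():.4f}")

if __name__ == "__main__":
    main()
```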