Harnessing the Vulnerability of Latent Layers in Adversarially Trained Models

Authors: Nupur Kumari, Mayank Singh, Abhishek Sinha, Harshitha Machiraju, Balaji Krishnamurthy, Vineeth N Balasubramanian

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We analyze adversarially trained robust models to study their vulnerability to adversarial attacks at the level of the latent layers. Our analysis reveals that, contrary to the input layer, which is robust to adversarial attack, the latent layers of these robust models are highly susceptible to adversarial perturbations of small magnitude. Leveraging this observation, we introduce a new technique, Latent Adversarial Training (LAT), which consists of fine-tuning adversarially trained models to ensure robustness at the feature layers. We also propose Latent Attack (LA), a novel algorithm for constructing adversarial examples. LAT results in a minor improvement in test accuracy and leads to state-of-the-art adversarial accuracy against the universal first-order adversarial PGD attack, which is shown for the MNIST, CIFAR-10, CIFAR-100, SVHN and Restricted ImageNet datasets.
Researcher Affiliation | Collaboration | Nupur Kumari (1), Mayank Singh (1), Abhishek Sinha (1), Harshitha Machiraju (2), Balaji Krishnamurthy (1) and Vineeth N Balasubramanian (2); (1) Adobe Inc., Noida; (2) IIT Hyderabad; {nupkumar, msingh, abhsinha, kbalaji}@adobe.com; {ee14btech11011, vineethnb}@iith.ac.in
Pseudocode | Yes | Algorithm 1 describes our LAT training technique. [...] The pseudocode of the proposed algorithm (LA) is given in Algorithm 2. (Hedged sketches of both algorithms appear after this table.)
Open Source Code | Yes | Code available at: https://github.com/msingh27/LAT_adversarial_robustness
Open Datasets | Yes | MNIST [LeCun et al., 1989]: We use the network architecture as described in [Madry et al., 2018]. [...] CIFAR-10 [Krizhevsky et al., 2010]: We use the network architecture as in [Madry et al., 2018]. [...] CIFAR-100 [Krizhevsky et al., 2010]: We use the same network architecture as used for CIFAR-10. [...] SVHN [Netzer et al., 2011]: We use the same network architecture as used for CIFAR-10. [...] Restricted ImageNet [Tsipras et al., 2019]: The dataset consists of a subset of ImageNet classes which have been grouped into 9 different classes.
Dataset Splits | No | The paper mentions using the 'test-set' and 'test data' for evaluation, but does not explicitly describe train/validation/test splits, their percentages, or the sample counts needed to reproduce the data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only describing the models and datasets.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names like PyTorch or TensorFlow, or specific Python versions).
Experiment Setup | Yes | Algorithm 1 describes our LAT training technique. Input: adversarially trained model parameters θ, index m of the sub-network that needs to be adversarially fine-tuned, fine-tuning steps k, batch size B, learning rate η, hyperparameter ω. [...] We perform 2 epochs of fine-tuning for MNIST, CIFAR-10 and Restricted ImageNet, 1 epoch for SVHN, and 5 epochs for CIFAR-100 using the different techniques. The results are calculated with a constraint on the maximum per-pixel perturbation of 0.3/1.0 for MNIST and 8.0/255.0 for CIFAR-10, CIFAR-100, Restricted ImageNet and SVHN. [...] We fix ω to the best-performing value of 0.2. [...] Attack parameters: step size for the latent layer αl, step size for the input layer αx, intermediate iteration steps p, global iteration steps k. (These settings are also collected into a configuration sketch after the table.)
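
The table quotes Algorithm 1 (LAT) only at the level of its inputs. Below is a minimal PyTorch-style sketch of how latent adversarial fine-tuning of this kind could look. The split of the network into a prefix g and suffix h at the chosen sub-network index, the helper pgd_perturb, the latent perturbation budget eps_l, and the exact way the input-level and latent-level adversarial losses are mixed with ω are all assumptions made for illustration, not the authors' reference implementation.

```python
# Hedged sketch of LAT-style fine-tuning (not the authors' reference code).
# Assumptions: the model is split at the chosen layer into g (prefix) and h (suffix);
# adversarial examples are produced by standard L-infinity PGD both at the input and
# at the latent activation; the two losses are mixed with weight w (omega in the paper).
import torch
import torch.nn.functional as F

def pgd_perturb(tensor, loss_fn, eps, step, iters):
    """Generic L-infinity PGD that maximises loss_fn in an eps-ball around `tensor`."""
    delta = torch.zeros_like(tensor, requires_grad=True)
    for _ in range(iters):
        loss = loss_fn(tensor + delta)
        loss.backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (tensor + delta).detach()

def lat_finetune_step(g, h, x, y, optimizer, w=0.2,
                      eps_x=8 / 255, eps_l=0.01, step_x=2 / 255, iters=10):
    # (1) Standard input-space adversarial example for the full model h(g(.)).
    x_adv = pgd_perturb(x, lambda v: F.cross_entropy(h(g(v)), y), eps_x, step_x, iters)
    # (2) Adversarial perturbation of the latent activation at the chosen layer.
    z = g(x).detach()
    z_adv = pgd_perturb(z, lambda v: F.cross_entropy(h(v), y), eps_l, eps_l / 4, iters)
    # (3) Mix the two adversarial losses with weight w (assumed form of the objective);
    #     zero_grad also clears gradients accumulated on g and h during PGD.
    optimizer.zero_grad()
    loss = (1 - w) * F.cross_entropy(h(g(x_adv)), y) + w * F.cross_entropy(h(z_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()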
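
Similarly, the sketch below shows one plausible shape for a Latent Attack (LA)-style procedure: alternate between perturbing the latent activation and updating the input so that its representation tracks the perturbed latent. The alternation scheme, the MSE matching loss, and the default values of eps, alpha_l, alpha_x, p and k are assumptions; only the parameter names αl, αx, p and k come from the quoted setup.

```python
# Hedged sketch of a Latent Attack (LA)-style procedure (not the authors' Algorithm 2).
import torch
import torch.nn.functional as F

def latent_attack(g, h, x, y, eps=8 / 255, alpha_l=0.01, alpha_x=2 / 255, p=10, k=5):
    x_adv = x.clone().detach()
    for _ in range(k):                       # global iteration steps k
        # Step 1: perturb the latent activation to maximise the classification loss.
        z = g(x_adv).detach().requires_grad_(True)
        for _ in range(p):                   # intermediate iteration steps p
            loss = F.cross_entropy(h(z), y)
            grad, = torch.autograd.grad(loss, z)
            z = (z + alpha_l * grad.sign()).detach().requires_grad_(True)
        z_adv = z.detach()

        # Step 2: move the input so that its latent representation approaches z_adv.
        for _ in range(p):
            x_adv.requires_grad_(True)
            dist = F.mse_loss(g(x_adv), z_adv)
            grad, = torch.autograd.grad(dist, x_adv)
            with torch.no_grad():
                x_adv = x_adv - alpha_x * grad.sign()             # descend on the distance
                x_adv = x + (x_adv - x).clamp(-eps, eps)          # project to the eps-ball
                x_adv = x_adv.clamp(0.0, 1.0)                     # keep valid pixel range
            x_adv = x_adv.detach()
    return x_adv
```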
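
Finally, the numeric settings quoted in the Experiment Setup row can be gathered into a small configuration mapping for quick reference. The dictionary layout and key names are ours; the values (ω = 0.2, the per-pixel perturbation budgets, and the fine-tuning epochs) are taken directly from the quoted text.

```python
# Settings quoted in the Experiment Setup row. The dictionary layout is ours;
# the numbers are the paper's. Perturbation budgets are per-pixel L-infinity bounds.
EXPERIMENT_CONFIG = {
    "omega": 0.2,  # mixing hyperparameter used in LAT fine-tuning (best reported value)
    "datasets": {
        "MNIST":               {"epsilon": 0.3 / 1.0,   "finetune_epochs": 2},
        "CIFAR-10":            {"epsilon": 8.0 / 255.0, "finetune_epochs": 2},
        "CIFAR-100":           {"epsilon": 8.0 / 255.0, "finetune_epochs": 5},
        "SVHN":                {"epsilon": 8.0 / 255.0, "finetune_epochs": 1},
        "Restricted ImageNet": {"epsilon": 8.0 / 255.0, "finetune_epochs": 2},
    },
}
```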