On the Functional Similarity of Robust and Non-Robust Neural Representations

Authors: András Balogh, Márk Jelasity

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here, we investigate the functional similarity of robust and non-robust representations for image classification with the help of model stitching. We find that robust and non-robust networks indeed have different representations. In order to answer our research question, we performed an empirical study of stitching adversarially robust models and non-robust models in a number of different scenarios, for the image classification task.
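The stitching setup quoted above can be made concrete with a short sketch. The 1×1 convolutional stitching layer, the `front`/`back` split and the class name `StitchedModel` below are illustrative assumptions in the spirit of the paper (and of Csiszárik et al., 2021), not the authors' released code.

```python
import torch
import torch.nn as nn

class StitchedModel(nn.Module):
    """Front part of network A feeding the back part of network B through a
    trainable stitching layer; both pre-trained parts stay frozen."""

    def __init__(self, front: nn.Module, back: nn.Module,
                 front_channels: int, back_channels: int):
        super().__init__()
        self.front, self.back = front, back
        # A 1x1 convolution is a common choice of stitching layer for
        # convolutional feature maps; only its weights are trained.
        self.stitch = nn.Conv2d(front_channels, back_channels, kernel_size=1)
        for p in list(self.front.parameters()) + list(self.back.parameters()):
            p.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            h = self.front(x)   # representation of model A at the cut point
        h = self.stitch(h)      # map A's representation into B's feature space
        return self.back(h)     # continue through model B to the logits
```

In the model-stitching methodology, training then minimizes the usual classification loss with respect to the stitching layer alone; a stitched network that recovers accuracy close to the original models is taken as evidence that the two representations are functionally similar.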
Researcher Affiliation | Academia | András Balogh (1), Márk Jelasity (1,2); (1) University of Szeged, Hungary; (2) ELKH-SZTE Research Group on Artificial Intelligence, Szeged, Hungary. Correspondence to: András Balogh <abalogh@inf.u-szeged.hu>, Márk Jelasity <jelasity@inf.u-szeged.hu>.
Pseudocode | No | The paper describes its methods using mathematical equations and textual explanations, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code to reproduce our results can be found at https://github.com/szegedai/robust-stitching
Open Datasets | Yes | Throughout the paper, we present experimental results over the CIFAR-10 dataset (Krizhevsky, 2009) using a number of pre-trained ResNet-18 (He et al., 2016b) networks, and each experiment is repeated three times independently. For additional datasets and architectures please refer to Appendix B. We experiment with two robust networks, both available from RobustBench (Croce et al., 2021): f_R1 denotes the ResNet-18 network of (Sehwag et al., 2022) and f_R2 is the network of (Addepalli et al., 2021). Our experiments were conducted using a number of models trained on the following datasets: MNIST (LeCun et al., 2010), Fashion MNIST (Xiao et al., 2017), CIFAR-10 (Krizhevsky, 2009) and SVHN (Netzer et al., 2011).
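For reference, the datasets and RobustBench checkpoints mentioned above can be loaded roughly as follows. The RobustBench model identifiers for the Sehwag et al. (2022) and Addepalli et al. (2021) ResNet-18 checkpoints are assumptions inferred from the cited papers rather than taken from the text, and should be checked against the RobustBench model zoo.

```python
import torch
from torchvision import datasets, transforms
from robustbench.utils import load_model  # pip install robustbench

# CIFAR-10 test set via torchvision (MNIST, FashionMNIST and SVHN are analogous).
test_set = datasets.CIFAR10(root="./data", train=False, download=True,
                            transform=transforms.ToTensor())
test_loader = torch.utils.data.DataLoader(test_set, batch_size=128, shuffle=False)

# The two robust ResNet-18 networks from RobustBench.
# NOTE: these identifiers are assumptions based on the cited papers, not
# copied from the paper; verify them on the RobustBench leaderboard.
f_R1 = load_model(model_name="Sehwag2021Proxy_R18",
                  dataset="cifar10", threat_model="Linf")   # Sehwag et al., 2022
f_R2 = load_model(model_name="Addepalli2021Towards_RN18",
                  dataset="cifar10", threat_model="Linf")   # Addepalli et al., 2021
```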
Dataset Splits | No | The paper refers to using a 'test set' for evaluation (Section A.4) and to training on datasets such as CIFAR-10, but it does not explicitly specify training/validation/test splits (e.g., percentages or sample counts) or how a validation set was used in the experimental setup. While these standard datasets ship with predefined splits, the paper does not state its own partitioning methodology beyond implicitly using a test set.
Hardware Specification | Yes | For our experiments we used a mixture of GeForce 2080 Ti 10G, 3060 12G, and V100DX-16C (10G virtual slice) GPUs.
Software Dependencies | No | The paper mentions using PyTorch for datasets (Appendix A.1) but does not provide specific version numbers for PyTorch or any other software libraries, frameworks, or solvers used in their experiments.
Experiment Setup | Yes | Our hyperparameter settings for model training closely followed those of (Csiszárik et al., 2021). All of our non-robust models were trained minimizing the cross-entropy loss function with an ℓ2 weight decay coefficient of 10⁻⁴ using the stochastic gradient descent (SGD) optimizer with a Nesterov momentum of 0.9. For better generalization we used the following augmentation techniques: random horizontal flip, random crop and, in the case of MNIST, random affine transformations with scaling in the range of [0.9, 1.1] and at most 5 degrees of rotation and shearing. We set the initial learning rate to 10⁻³ when training VGG-11 models and to 10⁻¹ for every other architecture. During training, we divided the learning rate by 10 at 1/3 and 2/3 of the total number of training epochs. We trained the MNIST models for 30 epochs and the CIFAR-10, SVHN and Fashion MNIST models for 200 epochs. We trained our robust models with the ℓ∞ threat model with the standard setting ϵ = 0.3 for the MNIST and Fashion MNIST models and ϵ = 8/255 for the CIFAR-10 and SVHN models. For the internal maximization in adversarial training, when using the strategy of (Madry et al., 2018; Goodfellow et al., 2015), we used the untargeted projected gradient descent (PGD) attack as described by (Madry et al., 2018) with 10 iterations and with a step size parameter of 0.1 for the MNIST and Fashion MNIST models and 2/255 for the CIFAR-10 and SVHN models.
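A condensed sketch of this training recipe follows: SGD with Nesterov momentum and weight decay 10⁻⁴, the learning rate divided by 10 at 1/3 and 2/3 of the epochs, and a 10-step untargeted ℓ∞ PGD inner maximization for adversarial training. The function names, the default ϵ and step size (the CIFAR-10/SVHN values), and the loop structure are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, step_size=2/255, steps=10):
    """Untargeted L-infinity PGD (Madry et al., 2018): random start in the
    eps-ball, then `steps` signed-gradient ascent steps, clamped to [0, 1]."""
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + step_size * grad.sign()).clamp(-eps, eps).detach()
    return (x + delta).clamp(0.0, 1.0)

def train(model, train_loader, epochs=200, lr=0.1, adversarial=False, device="cuda"):
    """Cross-entropy training with the hyperparameters quoted above."""
    model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9,
                          nesterov=True, weight_decay=1e-4)
    # Divide the learning rate by 10 at 1/3 and 2/3 of the total epochs.
    sched = torch.optim.lr_scheduler.MultiStepLR(
        opt, milestones=[epochs // 3, 2 * epochs // 3], gamma=0.1)
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            if adversarial:
                model.eval()                   # freeze BN statistics for the attack
                x = pgd_attack(model, x, y)    # inner maximization step
                model.train()
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
        sched.step()
```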