Clustering Effect of Adversarial Robust Models

Authors: Yang Bai, Xin Yan, Yong Jiang, Shu-Tao Xia, Yisen Wang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental evaluations demonstrate the rationality and superiority of our proposed clustering strategy."
Researcher Affiliation | Collaboration | (1) Tsinghua Berkeley Shenzhen Institute, Tsinghua University; (2) Tsinghua Shenzhen International Graduate School, Tsinghua University; (3) Key Lab. of Machine Perception, School of Artificial Intelligence, Peking University; (4) Institute for Artificial Intelligence, Peking University; (5) PCL Research Center of Networks and Communications, Peng Cheng Laboratory, China
Pseudocode | Yes | Algorithm 1: Extracting the Linear Weight Matrix W; Algorithm 2: Enhancing the Hierarchical Clustering Effect
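The paper's Algorithm 1 details are in the PDF, but for a ReLU network the forward map is piecewise linear, so one plausible way to extract W for a given input is to take the Jacobian of the logits with respect to that input. The sketch below rests on that assumption; `extract_linear_weight` is a hypothetical helper, not the authors' implementation, which may instead build W layer by layer from activation patterns:

```python
# Sketch only: assumes the local linear weight of a piecewise-linear (ReLU)
# network equals the input Jacobian of its logits; the paper's Algorithm 1
# may compute W differently.
import torch
from torchvision.models import resnet18

def extract_linear_weight(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Return W with shape (num_classes, num_inputs) such that, locally,
    logits ≈ W @ x.flatten() + b for a ReLU network."""
    model.eval()
    x = x.unsqueeze(0).requires_grad_(True)            # (1, C, H, W)
    logits = model(x)                                  # (1, num_classes)
    rows = [
        torch.autograd.grad(logits[0, k], x, retain_graph=True)[0].flatten()
        for k in range(logits.shape[1])                # one backward pass per logit
    ]
    return torch.stack(rows)                           # (num_classes, C*H*W)

# Usage: effective weight matrix of ResNet-18 around one CIFAR-sized input.
W = extract_linear_weight(resnet18(num_classes=10), torch.rand(3, 32, 32))
print(W.shape)  # torch.Size([10, 3072])
```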
Open Source Code | Yes | "Our code is available at https://github.com/bymavis/Adv_Weight_NeurIPS2021."
Open Datasets | Yes | "Besides Fig. 1, we show more clustering effect across data sets using STL-10 [8] and two reconstructed data sets, CIFAR-20 and ImageNet-10, composed of 20/10 subclasses from 4/2 superclasses in CIFAR-100 and Tiny ImageNet [27]."
Dataset Splits | Yes | "randomly divide a hierarchical data set into source data and target data according to their subclasses. That is, source and target data share the same superclass yet different subclasses, dubbed subpopulation shift. In the training phase, we use fine labels of source data and map fine labels to coarse ones. In such a fashion, the model returns fine and coarse labels."
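A minimal sketch of such a subpopulation-shift split, assuming a CIFAR-100-style list of (image, fine_label) samples and a known fine-to-coarse hierarchy. The `HIERARCHY` mapping and `subpopulation_split` helper below are illustrative placeholders, not the paper's actual CIFAR-20 selection:

```python
# Sketch only: splits a hierarchical dataset so that source and target share
# superclasses but contain disjoint subclasses ("subpopulation shift").
import random

# Hypothetical hierarchy: coarse label -> its fine (subclass) labels.
HIERARCHY = {
    0: [4, 30, 55, 72, 95],   # e.g. "aquatic mammals"
    1: [1, 32, 67, 73, 91],   # e.g. "fish"
}

def subpopulation_split(samples, seed=0):
    """samples: iterable of (image, fine_label).
    Returns (source, target), each a list of (image, fine_label, coarse_label),
    sharing superclasses but with disjoint subclasses."""
    rng = random.Random(seed)
    source_fine, target_fine, fine_to_coarse = set(), set(), {}
    for coarse, fines in HIERARCHY.items():
        fines = rng.sample(fines, len(fines))          # shuffled copy
        k = (len(fines) + 1) // 2
        source_fine.update(fines[:k])                  # half the subclasses -> source
        target_fine.update(fines[k:])                  # the rest -> target
        fine_to_coarse.update({f: coarse for f in fines})
    source = [(x, y, fine_to_coarse[y]) for x, y in samples if y in source_fine]
    target = [(x, y, fine_to_coarse[y]) for x, y in samples if y in target_fine]
    return source, target
```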
Hardware Specification | Yes | "The AT model costs 191.20s per epoch, while the AT+C model costs 193.66s, using one GPU 1080X with batch size 128 of ResNet-18 on CIFAR-10."
Software Dependencies | No | The paper mentions software components and models (e.g., ResNet-18, DenseNet-121, AlexNet, PGD, SGD, AutoAttack) but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | "For robust models, we adversarially train ResNet-18 [15] using PGD-10 (ε = 8/255, step size 2/255) with random start. The non-robust models are standard trained. Both models are trained for 200 epochs using SGD with momentum 0.9, weight decay 2e-4, and initial learning rate 0.1, which is divided by 10 at the 75th and 90th epochs."
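The quoted setup maps directly onto a standard PyTorch adversarial-training loop. A minimal sketch assuming a CIFAR-10 `train_loader` (batch size 128) and a CUDA device; this is a generic PGD-10 implementation of the stated hyperparameters, not the authors' code:

```python
# Sketch only: PGD-10 adversarial training with eps = 8/255, step size 2/255,
# random start; SGD with momentum 0.9, weight decay 2e-4, lr 0.1 divided by
# 10 at epochs 75 and 90, for 200 epochs. `train_loader` is assumed.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-inf PGD with random start inside the eps-ball around x in [0, 1]."""
    delta = torch.empty_like(x).uniform_(-eps, eps)
    delta = (x + delta).clamp(0, 1) - x                # keep x + delta valid
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta.detach() + alpha * grad.sign()).clamp(-eps, eps)
        delta = (x + delta).clamp(0, 1) - x
    return (x + delta).detach()

model = resnet18(num_classes=10).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=2e-4)
sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[75, 90], gamma=0.1)

for epoch in range(200):
    model.train()
    for x, y in train_loader:                          # assumed CIFAR-10 loader
        x, y = x.cuda(), y.cuda()
        x_adv = pgd_attack(model, x, y)
        opt.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        opt.step()
    sched.step()
```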