CARTL: Cooperative Adversarially-Robust Transfer Learning
Authors: Dian Chen, Hongxin Hu, Qian Wang, Yinli Li, Cong Wang, Chao Shen, Qi Li
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that CARTL improves the inherited robustness by up to about 28% compared with the baseline at the same level of accuracy. Furthermore, we study the relationship between the batch normalization (BN) layers and the robustness in the context of transfer learning, and we reveal that freezing BN layers can further boost the robustness transfer. We conduct extensive experiments on several transfer learning scenarios and observe that a target model that freezes the affine parameters of its BN layers obtains higher robustness with negligible loss of accuracy. |
| Researcher Affiliation | Academia | (1) School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, Hubei, China; (2) Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY 14260, USA; (3) Department of Computer Science, City University of Hong Kong, HK SAR, China; (4) School of Cyber Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China; (5) Institute for Network Sciences and Cyberspace & BNRist, Tsinghua University, Beijing 100084, China. Correspondence to: Qian Wang <qianwang@whu.edu.cn>. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | For more details about experiment settings, please refer to Appendix C, and our codes are available on GitHub: https://github.com/NISP-official/CARTL |
| Open Datasets | Yes | To see the effect of fine-tuning on the robustness and accuracy, we adversarially train a Wide ResNet (WRN) 34-10 (Zagoruyko & Komodakis, 2017) on CIFAR-100 and a WRN 28-4 on CIFAR-10 as source models, then transfer them to CIFAR-10 and SVHN, respectively. (A dataset-loading sketch follows the table.) |
| Dataset Splits | No | No explicit details on specific training/validation/test dataset splits (e.g., percentages, counts, or specific methods like k-fold cross-validation) were found in the provided text. The paper mentions training data and refers to Appendix C for more details. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used for running the experiments were provided in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names like PyTorch, TensorFlow, or specific solvers with their versions) were provided in the paper. |
| Experiment Setup | Yes | The source models are trained with PGD-7, and the perturbation is constrained within an ℓ∞ ball with a radius of ϵ = 8/255. During transferring, we break the source models into blocks and fine-tune them in the unit of blocks (e.g., two layers at once for a WRN block). Then we report the adversarial robustness of the target models against the PGD-100 attack. Here, λ is the hyper-parameter controlling the strength of the FDM penalty term. We also emphasize that, different from naive spectral normalization, we add a hyper-parameter β ∈ (0, 1] for further scaling the Lipschitz constant of the fine-tuned part. (A PGD sketch follows the table.) |
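
As a reading aid for the Open Datasets row, here is a minimal sketch, not the authors' code, of the source/target dataset pairing quoted above: a WRN 34-10 source trained on CIFAR-100 is transferred to CIFAR-10, and a WRN 28-4 source trained on CIFAR-10 is transferred to SVHN. All three datasets are publicly available through `torchvision`; the WRN models themselves are not built here, and any standard WideResNet implementation could be substituted.

```python
# Minimal sketch: load the public datasets used in the two transfer scenarios
# quoted from the paper (CIFAR-100 -> CIFAR-10 and CIFAR-10 -> SVHN).
# The WRN 34-10 / WRN 28-4 source models are not constructed here.
import torchvision
import torchvision.transforms as T

to_tensor = T.ToTensor()

# Source datasets (used to adversarially pre-train the source models).
cifar100 = torchvision.datasets.CIFAR100("data", train=True, download=True, transform=to_tensor)
cifar10 = torchvision.datasets.CIFAR10("data", train=True, download=True, transform=to_tensor)

# Target dataset (CIFAR-10 above also serves as the target of the CIFAR-100 source).
svhn = torchvision.datasets.SVHN("data", split="train", download=True, transform=to_tensor)

print(len(cifar100), len(cifar10), len(svhn))  # 50000, 50000, 73257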
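And here is a minimal PGD sketch matching the attack settings quoted in the Experiment Setup row: PGD-7 for adversarial training, PGD-100 for evaluation, and an ℓ∞ radius of ϵ = 8/255. It illustrates the standard attack rather than the CARTL implementation; `model`, `x`, and `y` are assumed to be a classifier and a batch of inputs/labels, and the step size `alpha = 2/255` is a common default that the quoted text does not state.

```python
# Generic PGD attack sketch (standard formulation, not the authors' code).
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=7):
    """Projected gradient descent inside an l-infinity ball of radius eps."""
    # Random start inside the eps-ball, detached so it is a leaf tensor.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()        # gradient ascent step
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
            x_adv = x_adv.clamp(0, 1)                  # keep valid pixel range
    return x_adv.detach()
```

Evaluating robustness against PGD-100, as reported in the table, amounts to calling `pgd_attack(model, x, y, steps=100)` and measuring accuracy on the returned adversarial examples.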