Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark
Authors: Mengxi Ya, Yiming Li, Tao Dai, Bin Wang, Yong Jiang, Shu-Tao Xia
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on benchmark datasets are conducted, verifying the effectiveness of our generalization-limited watermark. |
| Researcher Affiliation | Collaboration | Mengxi Ya1, Yiming Li2,3, Tao Dai4, Bin Wang1,5, Yong Jiang1, Shu-Tao Xia1. 1Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China; 2The State Key Laboratory of Blockchain and Data Security, Zhejiang University, China; 3ZJU-Hangzhou Global Scientific and Technological Innovation Center, China; 4College of Computer Science and Software Engineering, Shenzhen University, China; 5Guangzhou Intelligence Communications Technology Co., Ltd., China |
| Pseudocode | Yes | Algorithm 1 Generating the penalized trigger. Algorithm 2 Watermarking DNNs with BWTP. Algorithm 3 Generating the desired synthesized trigger. Algorithm 4 Watermarking DNNs with GLBW. |
| Open Source Code | Yes | Our codes are available at https://github.com/yamengxi/GLBW. |
| Open Datasets | Yes | We adopt ResNet-18 (He et al., 2016) on CIFAR-10 (Krizhevsky, 2009) and GTSRB (Stallkamp et al., 2012) datasets for our discussions. |
| Dataset Splits | No | The paper mentions using a "benign training set" and "100 testing images for evaluation" (for SRV methods), but it does not specify any explicit dataset splits for training, validation, and testing of the models themselves (e.g., percentages or counts for each split). |
| Hardware Specification | Yes | We train each model with one NVIDIA RTX3090 GPU. |
| Software Dependencies | No | The paper mentions using the "Python toolbox BackdoorBox" and the "SGD optimizer" but does not specify version numbers for Python, deep learning frameworks (like PyTorch or TensorFlow), or other key software libraries used in the experiments. |
| Experiment Setup | Yes | Specifically, the target label is set to 0 and the watermarking rate is set to 5%. ... we simply set λ1 = λ2 = 1 for BWTP and λ3 = λ4 = 1 for our GLBW. ...the mini-batch size is set to 128 for each iteration. We use the SGD optimizer with momentum=0.9 and weight decay=5e-4. For CIFAR-10, the number of training epochs is 200 and the learning rate is initialized as 0.1 and multiplied by 0.1 at the 150th and 180th epochs, respectively. For GTSRB, the number of training epochs is 30 and the learning rate is initialized as 0.01 and multiplied by 0.1 at the 20th epoch. |
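
For reference, the quoted training configuration can be expressed as a short PyTorch sketch. Only the hyperparameters quoted in the table (ResNet-18, CIFAR-10/GTSRB, SGD with momentum=0.9 and weight decay=5e-4, batch size 128, and the learning-rate schedules) come from the paper; the watermark injection itself (BWTP/GLBW), the data augmentation, and the training loop below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the quoted training setup (CIFAR-10 branch), assuming a
# standard torchvision pipeline. The BWTP/GLBW watermarking steps are omitted.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumption: augmentation details are not quoted in the paper excerpt.
transform = T.Compose([T.ToTensor()])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

# Quoted: ResNet-18. Note: torchvision's variant uses an ImageNet-style stem;
# the paper may use a CIFAR-adapted ResNet-18 instead.
model = torchvision.models.resnet18(num_classes=10).to(device)

# Quoted optimizer settings: SGD, momentum=0.9, weight decay=5e-4, initial lr=0.1.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
# Quoted CIFAR-10 schedule: 200 epochs, lr multiplied by 0.1 at epochs 150 and 180.
# (For GTSRB the quoted settings are 30 epochs, lr=0.01, milestone [20].)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[150, 180], gamma=0.1)

criterion = nn.CrossEntropyLoss()
for epoch in range(200):
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```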