XRand: Differentially Private Defense against Explanation-Guided Attacks
Authors: Truc Nguyen, Phung Lai, Hai Phan, My T. Thai
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluation of our solution is given in Section 6. To evaluate the performance of our defense, we conduct the XBA proposed by (Severi et al. 2021) against the explanations returned by XRAND to create backdoors on cloud-hosted malware classifiers. Our experiments aim to shed light on understanding (1) the effectiveness of the defense in mitigating the XBA, and (2) the faithfulness of the explanations returned by XRAND. |
| Researcher Affiliation | Academia | 1 University of Florida, Gainesville, FL 32611 2 New Jersey Institute of Technology, Newark, NJ 07102 |
| Pseudocode | Yes | Algorithm 1: XRAND: Explanation-guided RR mechanism |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is released, nor does it provide a direct link to a code repository. It mentions 'All technical appendices can be accessed via the full version of our paper (Nguyen et al. 2022)' which links to an arXiv preprint of the paper itself, not source code. |
| Open Datasets | Yes | The experiments are conducted using LightGBM (Anderson and Roth 2018) and EmberNN (Severi et al. 2021) classification models that are trained on the EMBER (Anderson and Roth 2018) dataset. |
| Dataset Splits | No | The paper mentions training data and test samples, but does not explicitly provide specific training/validation/test dataset splits (e.g., percentages, counts, or predefined split methods). It refers to 'A detailed description of the experimental settings can be found in Appx. E,' but this appendix is not provided in the given text. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as GPU models, CPU types, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions using the 'LightGBM' and 'EmberNN' classification models but does not provide specific version numbers for these or any other software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | We set k to be equal to the trigger size of the attack, and fix the predefined threshold τ = 50. ... At a 1% poison rate, our defense manages to reduce the attack success rate from 77.8% to 10.2% with ε = 1.0 for LightGBM. |
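For context on the privacy mechanism referenced above: XRAND's Algorithm 1 is an explanation-guided randomized-response (RR) mechanism, and the experiments report results at ε = 1.0. The explanation-guided variant itself is not reproduced here, but the classic binary ε-RR primitive that such mechanisms build on can be sketched as follows (the function name and interface are illustrative, not from the paper):

```python
import math
import random


def randomized_response(bit: int, epsilon: float, rng=random.random) -> int:
    """Classic binary randomized response satisfying epsilon-differential privacy.

    Reports the true bit with probability e^eps / (e^eps + 1)
    and the flipped bit otherwise.
    """
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng() < p_keep else 1 - bit


# At epsilon = 1.0 (the setting reported in the table above),
# the true value is kept with probability e / (e + 1) ~= 0.731.
p_keep_eps1 = math.exp(1.0) / (math.exp(1.0) + 1.0)
```

A smaller ε drives `p_keep` toward 0.5 (more noise, stronger privacy), which is the trade-off the paper's experiments explore between defense strength and explanation faithfulness.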