XRand: Differentially Private Defense against Explanation-Guided Attacks

Authors: Truc Nguyen, Phung Lai, Hai Phan, My T. Thai

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental evaluation of our solution is given in Section 6. To evaluate the performance of our defense, we conduct the XBA proposed by (Severi et al. 2021) against the explanations returned by XRAND to create backdoors on cloud-hosted malware classifiers. Our experiments aim to shed light on (1) the effectiveness of the defense in mitigating the XBA, and (2) the faithfulness of the explanations returned by XRAND.
Researcher Affiliation | Academia | 1) University of Florida, Gainesville, FL 32611; 2) New Jersey Institute of Technology, Newark, NJ 07102
Pseudocode | Yes | The paper provides pseudocode in Algorithm 1: XRAND: Explanation-guided RR (randomized response) mechanism. An illustrative sketch of such a mechanism is given after this table.
Open Source Code | No | The paper does not explicitly state that source code for the described methodology is released, nor does it provide a direct link to a code repository. It mentions 'All technical appendices can be accessed via the full version of our paper (Nguyen et al. 2022)', which links to an arXiv preprint of the paper itself, not to source code.
Open Datasets | Yes | The experiments are conducted using LightGBM (Anderson and Roth 2018) and EmberNN (Severi et al. 2021) classification models that are trained on the EMBER (Anderson and Roth 2018) dataset. A minimal training sketch on EMBER-style features follows the table.
Dataset Splits | No | The paper mentions training data and test samples, but does not explicitly provide specific training/validation/test dataset splits (e.g., percentages, counts, or predefined split methods). It refers to 'A detailed description of the experimental settings can be found in Appx. E,' but this appendix is not provided in the given text.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as GPU models, CPU types, or cloud instance specifications.
Software Dependencies | No | The paper mentions using LightGBM and EmberNN classification models but does not provide specific version numbers for these or any other software dependencies required to reproduce the experiments.
Experiment Setup | Yes | We set k to be equal to the trigger size of the attack, and fix the predefined threshold τ = 50. ... At a 1% poison rate, our defense manages to reduce the attack success rate from 77.8% to 10.2% with ε = 1.0 for LightGBM. These reported settings are summarized in the configuration sketch below.