Adversarial Attack and Defense for Non-Parametric Two-Sample Tests

Authors: Xilie Xu, Jingfeng Zhang, Feng Liu, Masashi Sugiyama, Mohan Kankanhalli

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on both simulated and real-world datasets validate the adversarial vulnerabilities of non-parametric TSTs and the effectiveness of our proposed defense.
Researcher Affiliation | Academia | 1 School of Computing, National University of Singapore; 2 RIKEN Center for Advanced Intelligence Project (AIP); 3 School of Mathematics and Statistics, The University of Melbourne; 4 Graduate School of Frontier Sciences, The University of Tokyo.
Pseudocode | Yes | Algorithm 1: Ensemble Attack (EA); Algorithm 2: Adversarially Learning Deep Kernels; Algorithm 3: Testing with k_θ on S_P and S_Q.
Open Source Code | Yes | Source code is available at https://github.com/GodXuxilie/Robust-TST.git.
Open Datasets | Yes | We conduct six typical non-parametric TSTs (MMD-D, MMD-G, C2ST-S, C2ST-L, ME and SCF) under EA on five benchmark datasets: Blob (Gretton et al., 2012; Jitkrittum et al., 2016; Sutherland et al., 2017), high-dimensional Gaussian mixture (HDGM) (Liu et al., 2020a), Higgs (Chwialkowski et al., 2015), MNIST (LeCun et al., 1998; Radford et al., 2015) and CIFAR-10 (Krizhevsky, 2009). ... The Higgs dataset can be downloaded from the UCI Machine Learning Repository. ... The CIFAR-10 dataset can be downloaded via PyTorch (Paszke et al., 2019). (A download sketch follows after this table.)
Dataset Splits | No | The paper explicitly defines training data (e.g., "For Blob, HDGM and Higgs, we randomly sample a training pair (S_P^tr, S_Q^tr)...") and test data ("we randomly sample 100 new pairs (S_P^te, S_Q^te), disjoint from the training data, as the benign test pairs."). However, it does not explicitly mention a separate validation set or a specific validation split used for hyperparameter tuning or model selection during training. (A sampling sketch follows after this table.)
Hardware Specification | Yes | We conduct all experiments on Python 3.8 (PyTorch 1.1) with NVIDIA RTX A5000 GPUs.
Software Dependencies | Yes | We conduct all experiments on Python 3.8 (PyTorch 1.1) with NVIDIA RTX A5000 GPUs.
Experiment Setup | Yes | The training settings (e.g., the structure of the neural network and the optimizer) follow Liu et al. (2020a) and are illustrated in detail in Appendix E.2. ... We use the Adam optimizer (Kingma & Ba, 2015)... We set the drop-out rate to zero... We set the number of training samples n_tr to 100 for Blob, 3,000 for HDGM, 5,000 for Higgs, and 500 for MNIST and CIFAR-10. ... For C2ST-S and C2ST-L, we set the batch size to 128 for Blob, HDGM and Higgs, and 100 for MNIST and CIFAR-10. We set the number of training epochs to 9000 n_te/batchsize for Blob, 1,000 for HDGM and Higgs, and 2,000 for MNIST and CIFAR-10. We set the learning rate to 0.001 for Blob, HDGM and Higgs, and 0.0002 for MNIST and CIFAR-10. (A configuration sketch follows after this table.)
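
As a convenience for the Open Datasets entry above, here is a minimal sketch of fetching MNIST and CIFAR-10 through PyTorch's torchvision package, which the paper points to for CIFAR-10; the root directory and transform are assumptions, not taken from the paper, and the Higgs dataset must be obtained separately from the UCI Machine Learning Repository.

    # Minimal sketch: MNIST and CIFAR-10 via torchvision (root path and transform are assumed).
    import torchvision
    import torchvision.transforms as transforms

    transform = transforms.ToTensor()
    mnist = torchvision.datasets.MNIST(root="./data", train=True, download=True, transform=transform)
    cifar10 = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)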
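
The train/test protocol quoted under Dataset Splits (one training pair plus 100 benign test pairs disjoint from the training data) could be realized along the lines of the sketch below; the function name and index bookkeeping are hypothetical and only illustrate the described sampling, not the authors' actual code.

    # Hedged sketch: sample one training pair (S_P^tr, S_Q^tr) and n_pairs test pairs
    # (S_P^te, S_Q^te) whose indices are disjoint from the training indices.
    import numpy as np

    def sample_pairs(X_P, X_Q, n_tr, n_te, n_pairs=100, seed=0):
        rng = np.random.default_rng(seed)
        idx_P, idx_Q = rng.permutation(len(X_P)), rng.permutation(len(X_Q))
        S_P_tr, S_Q_tr = X_P[idx_P[:n_tr]], X_Q[idx_Q[:n_tr]]
        test_pairs = []
        for _ in range(n_pairs):
            te_P = rng.choice(idx_P[n_tr:], size=n_te, replace=False)
            te_Q = rng.choice(idx_Q[n_tr:], size=n_te, replace=False)
            test_pairs.append((X_P[te_P], X_Q[te_Q]))
        return (S_P_tr, S_Q_tr), test_pairs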
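
The hyperparameters quoted under Experiment Setup can be collected into a small configuration sketch. Only the numeric values come from the quoted text; the dictionary names, the model argument, and the helper function are assumptions introduced for illustration.

    # Sketch of the reported per-dataset training settings (values from the quoted setup).
    import torch

    N_TRAIN = {"Blob": 100, "HDGM": 3000, "Higgs": 5000, "MNIST": 500, "CIFAR10": 500}
    LEARNING_RATE = {"Blob": 1e-3, "HDGM": 1e-3, "Higgs": 1e-3, "MNIST": 2e-4, "CIFAR10": 2e-4}
    BATCH_SIZE_C2ST = {"Blob": 128, "HDGM": 128, "Higgs": 128, "MNIST": 100, "CIFAR10": 100}

    def make_optimizer(model: torch.nn.Module, dataset: str) -> torch.optim.Optimizer:
        # Adam optimizer as stated in the paper; drop-out is set to zero in the network itself.
        return torch.optim.Adam(model.parameters(), lr=LEARNING_RATE[dataset])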