MExMI: Pool-based Active Model Extraction Crossover Membership Inference
Authors: Yaxin Xiao, Qingqing Ye, Haibo Hu, Huadi Zheng, Chengfang Fang, Jie Shi
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that MExMI can improve up to 11.14% from the best known PAME attack and reach 94.07% fidelity with only 16k queries. |
| Researcher Affiliation | Collaboration | Yaxin Xiao (The Hong Kong Polytechnic University, 20034165r@connect.polyu.hk); Qingqing Ye (The Hong Kong Polytechnic University, qqing.ye@polyu.edu.hk); Haibo Hu (The Hong Kong Polytechnic University, haibo.hu@polyu.edu.hk); Huadi Zheng (The Hong Kong Polytechnic University, huadi.zheng@connect.polyu.hk); Chengfang Fang (Huawei International, Singapore, fang.chengfang@huawei.com); Jie Shi (Huawei International, Singapore, shi.jie1@huawei.com) |
| Pseudocode | Yes | As illustrated in Fig. 1 and pseudocode in Appendix A, the input of MExMI is an adversary data pool P and the access to a black-box victim model F, and its outputs are the copy model F̂ and the inferred training dataset D̂. |
| Open Source Code | Yes | The codes are available at https://github.com/mexmi/mexmi-project. |
| Open Datasets | Yes | Datasets. We perform PAME attacks on two image datasets, namely CIFAR10 [28] and Street View House Number (SVHN) [32], and a text dataset AG’S NEWS which contains a corpus of AG’s news articles [11] (see Appendix E.1 for details). |
| Dataset Splits | No | The paper mentions training and test datasets (e.g., 'CIFAR10 and AG’S NEWS test sets') but does not specify explicit validation dataset splits or details for reproducibility. |
| Hardware Specification | Yes | All experiments are conducted on NVIDIA GeForce RTX 3090, except for those run on ModelArts. |
| Software Dependencies | No | The paper mentions using frameworks like ModelArts and specific models (e.g., WideResNet-28-10, DPCNN, VGG16) and optimizers (Adam), but it does not specify version numbers for key software dependencies such as Python, PyTorch, or TensorFlow libraries. |
| Experiment Setup | Yes | Recall that the hyper-parameters (such as epoch, initial learning rate, and optimizer) of shadow models Fs can be adjusted to maximize the metric Q. Parameter a in Q is set as 0.05 and f(·) is set as log10(·). The preset weights ratio ω in MI Post-Filter is 5 : 1. For CIFAR10 experiments, MExMI queries 2k samples in each round with a total of 8 rounds. For AG’S NEWS experiments, there are 6 rounds, each with 5k samples. |
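The setup above describes a round-based query budget (e.g., 2k queries per round over 8 rounds for CIFAR10) in which each round selects samples from the adversary pool, labels them via the black-box victim, and refits the copy model. A minimal sketch of such a pool-based active extraction loop is below; the helper names `select_batch`, `victim`, and `copy_fit` are placeholders for illustration, not the paper's actual components (which include MI-guided selection and the MI Post-Filter):

```python
import numpy as np

def extract_rounds(pool, victim, copy_fit, select_batch,
                   rounds=8, batch_size=2000):
    """Pool-based active extraction loop (illustrative sketch).

    pool:         array of candidate inputs the adversary holds
    victim:       black-box labeling oracle, pool_slice -> labels
    copy_fit:     callback that (re)trains the copy model on all labeled data
    select_batch: active-selection strategy, (pool, remaining_idx, k) -> indices
    """
    labeled_x, labeled_y = [], []
    remaining = list(range(len(pool)))
    for _ in range(rounds):
        idx = select_batch(pool, remaining, batch_size)  # active selection
        labels = victim(pool[idx])                       # black-box queries
        labeled_x.append(pool[idx])
        labeled_y.append(labels)
        chosen = set(int(i) for i in idx)
        remaining = [i for i in remaining if i not in chosen]
        # refit the copy model on everything labeled so far
        copy_fit(np.concatenate(labeled_x), np.concatenate(labeled_y))
    return np.concatenate(labeled_x), np.concatenate(labeled_y)
```

In the paper's configuration the total query budget is simply `rounds * batch_size` (8 × 2k = 16k for CIFAR10), which matches the reported "94.07% fidelity with only 16k queries".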