Amplifying Membership Exposure via Data Poisoning
Authors: Yufei Chen, Chao Shen, Yun Shen, Cong Wang, Yang Zhang
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively evaluate our attacks on computer vision benchmarks. Our results show that the proposed attacks can substantially increase the membership inference precision with minimum overall test-time model performance degradation. |
| Researcher Affiliation | Collaboration | Yufei Chen¹,² Chao Shen¹ Yun Shen³ Cong Wang² Yang Zhang⁴. ¹Xi'an Jiaotong University, ²City University of Hong Kong, ³NetApp, ⁴CISPA Helmholtz Center for Information Security |
| Pseudocode | No | The paper describes its methods in text and mathematical formulas but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/yfchen1994/poisoning_membership. |
| Open Datasets | Yes | We adopt five datasets for our experiments, including (1) MNIST [1] that contains 60,000 handwritten digits from 0 to 9. (2) CIFAR-10 [2] that contains 60,000 images from 10 classes. (3) STL-10 [3] that contains 13,000 labeled images from 10 classes. (4) CelebA [25] that contains 202,599 face images annotated by 40 attributes. (5) PatchCamelyon [46] that contains 327,680 images to predict the presence of metastatic tissue. |
| Dataset Splits | Yes | In our experiment, we split each dataset into three portions: the clean training dataset Dclean containing the members, the testing dataset Dtest containing the non-members, and the shadow dataset Dshadow for generating poisoning samples. We follow the same setup as [38], where \|Dclean\| = \|Dtest\| = \|Dshadow\| and the three portions do not overlap. Additionally, we make the three datasets class-balanced to simplify per-class evaluation. *(See the split sketch after this table.)* |
| Hardware Specification | Yes | Our experiments were conducted on a deep learning server, which is equipped with an Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz, 128GB RAM, and four NVIDIA GeForce RTX 3090 GPUs with 24GB of memory. |
| Software Dependencies | Yes | We use five pretrained models provided by TensorFlow (v2.5.2): Xception, ResNet18, MobileNetV2, InceptionV3, and VGG16. |
| Experiment Setup | Yes | We fix the feature extractor and train the newly added FC layers with the Adam optimizer, with a learning rate of 10⁻³ and batch size of 100. ... The hyperparameters used in our implementation are summarized in Table 4: learning rate 0.001 for InceptionV3 and 0.01 for the others; noise multiplier 1.0; max L2-norm of gradients 1.0; batch size 100; microbatch size 100; epochs 20. |
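
The three-way split quoted above is straightforward to reproduce. Below is a minimal sketch in NumPy; the helper name `three_way_balanced_split`, the array layout, and the per-class sizing parameter are our assumptions for illustration, not code from the paper's repository.

```python
import numpy as np

def three_way_balanced_split(x, y, num_classes, per_split_per_class, seed=0):
    """Split (x, y) into three disjoint, equally sized, class-balanced
    portions: D_clean (members), D_test (non-members), D_shadow (poisoning)."""
    rng = np.random.default_rng(seed)
    parts = {"clean": [], "test": [], "shadow": []}
    n = per_split_per_class
    for c in range(num_classes):
        # Shuffle this class's indices, then carve out three equal slices.
        idx = rng.permutation(np.where(y == c)[0])
        assert len(idx) >= 3 * n, f"class {c} has fewer than {3 * n} samples"
        parts["clean"].append(idx[:n])
        parts["test"].append(idx[n:2 * n])
        parts["shadow"].append(idx[2 * n:3 * n])
    return {name: (x[np.concatenate(ids)], y[np.concatenate(ids)])
            for name, ids in parts.items()}
```

With CIFAR-10's 6,000 images per class, `per_split_per_class` can be at most 2,000 if the three portions are to stay equal-sized and disjoint, matching the \|Dclean\| = \|Dtest\| = \|Dshadow\| constraint.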
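
The transfer-learning setup (frozen pretrained feature extractor plus newly added FC layers trained with Adam) can be sketched as follows. This is a minimal illustration, assuming MobileNetV2 from `tf.keras.applications` as the backbone, a hypothetical 128-unit hidden FC layer, and 10 output classes; note that ResNet18 does not ship with `tf.keras.applications`, so that backbone must come from elsewhere. The commented optimizer lines show how Table 4's noise multiplier, clipping norm, and microbatch size would map onto TensorFlow Privacy's `DPKerasAdamOptimizer`; the exact import path is an assumption and varies by `tensorflow_privacy` version.

```python
import tensorflow as tf

# Frozen ImageNet-pretrained feature extractor (MobileNetV2 is one of the
# paper's five backbones; the 96x96 input size is our assumption).
base = tf.keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(96, 96, 3))
base.trainable = False  # fix the feature extractor; only new layers train

# Newly added FC layers; the 128-unit hidden layer is our assumption.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # e.g. CIFAR-10
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"])

# For the DP runs in Table 4, a TensorFlow Privacy optimizer along these
# lines (import path varies by version) would replace plain Adam:
# from tensorflow_privacy.privacy.optimizers.dp_optimizer_keras import (
#     DPKerasAdamOptimizer)
# optimizer = DPKerasAdamOptimizer(l2_norm_clip=1.0, noise_multiplier=1.0,
#                                  num_microbatches=100, learning_rate=0.01)

# Training with the quoted batch size and epoch count:
# model.fit(x_clean, y_clean, batch_size=100, epochs=20)
```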