Scalable Membership Inference Attacks via Quantile Regression
Authors: Martin Bertran, Shuai Tang, Aaron Roth, Michael Kearns, Jamie H. Morgenstern, Steven Z. Wu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show the efficacy of this approach in an extensive series of experiments on various datasets and model architectures. Our code is available at github.com/amazon-science/quantile-mia. (Section 4, Experiments:) We present two sets of experiments on two different data domains, including images and tabular data. |
| Researcher Affiliation | Collaboration | Martin Bertran (Amazon AWS AI/ML); Shuai Tang (Amazon AWS AI/ML); Michael Kearns (University of Pennsylvania; Amazon AWS AI/ML); Jamie Morgenstern (University of Washington; Amazon AWS AI/ML); Aaron Roth (University of Pennsylvania; Amazon AWS AI/ML); Zhiwei Steven Wu (Carnegie Mellon University; Amazon AWS AI/ML) |
| Pseudocode | No | The paper describes its methods in prose and mathematical equations (e.g., Section 3, 'Our Attack') but does not include any clearly labeled pseudocode or algorithm blocks. (A hedged code sketch of the attack, reconstructed from the prose description, follows the table.) |
| Open Source Code | Yes | Our code is available at github.com/amazon-science/quantile-mia. |
| Open Datasets | Yes | We evaluate the effectiveness of our proposed approach on four image classification datasets: CIFAR-10 [Krizhevsky et al., 2009], a standard image classification dataset with 10 target classes; CIFAR-100 [Krizhevsky et al., 2009], another image classification dataset with 100 target classes; ImageNet-1k [Russakovsky et al., 2015], a substantially larger image classification task with 1000 target classes; and CINIC-10 [Darlow et al., 2018], an extension of CIFAR-10 that additionally uses images from ImageNet-1k corresponding to the original 10 target classes. |
| Dataset Splits | Yes | In all experiments, 50% of the dataset is used for training the target model, and, following common standards, the resolution of the target model is 32x32 for the CIFAR and CINIC datasets and 224x224 for the ImageNet-1k dataset. Since there is a smaller body of literature on stable hyperparameters for regression models, we use Ray Tune [Liaw et al., 2018] for hyperparameter tuning (tuning is used to minimize validation pinball loss on a held-out dataset). FPR is computed on a held-out dataset that was not used to train the target or the quantile regression model. (See the evaluation sketch below the table.) |
| Hardware Specification | No | The paper states the compute budget in terms of '30 GPU minutes' and '16 hours' for experiments but does not specify any particular GPU model, CPU model, or other detailed hardware specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions software like 'Ray Tune [Liaw et al., 2018]', 'catboost', and 'Optuna [Akiba et al., 2019]' for hyperparameter tuning and model training, but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | To provide a realistic evaluation, we ensure our base models use common, well-performing architectures and follow standard guidelines for hyperparameter selection [He et al., 2015], including data augmentation, learning rate schedule, and l2 regularization (weight decay). Since there is a smaller body of literature on stable hyperparameters for regression models, we use Ray Tune [Liaw et al., 2018] for hyperparameter tuning (tuning is used to minimize validation pinball loss on a held-out dataset). Table 4: Summary of hyperparameters optimized for our quantile regressor model on all image experiments. Table 5: Summary of hyperparameters optimized for our quantile regressor model on tabular data. (See the pinball-loss sketch below the table.) |
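Based on the prose description quoted above (the paper itself gives no pseudocode), the attack trains a quantile regressor on public non-member data to predict, per example, the alpha-quantile of the target model's confidence score; an example is flagged as a member when its observed score exceeds that predicted quantile, so alpha directly controls the false-positive rate. The sketch below is a minimal reconstruction under those assumptions, not the authors' code (which lives at github.com/amazon-science/quantile-mia): `target_score` is a hypothetical helper, and scikit-learn's gradient-boosted quantile regressor stands in for the paper's image and tabular regression models.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def target_score(model, X, y):
    """Hypothetical helper: per-example confidence of the target model
    on the true label, mapped to a logit so the score is unbounded."""
    proba = model.predict_proba(X)
    p = np.clip(proba[np.arange(len(y)), y], 1e-6, 1 - 1e-6)
    return np.log(p / (1 - p))

def fit_attack(X_pub, y_pub, target_model, alpha=0.99):
    """Fit a per-example quantile regressor on public NON-member data,
    so q(x) estimates the alpha-quantile of the non-member score at x."""
    scores = target_score(target_model, X_pub, y_pub)
    q = GradientBoostingRegressor(loss="quantile", alpha=alpha)
    q.fit(X_pub.reshape(len(X_pub), -1), scores)
    return q

def predict_membership(q, target_model, X, y):
    """Flag x as a training member when its observed score exceeds the
    predicted alpha-quantile; non-member FPR is roughly 1 - alpha."""
    scores = target_score(target_model, X, y)
    return scores > q.predict(X.reshape(len(X), -1))
```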
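Both the Dataset Splits and Experiment Setup rows quote that hyperparameters are tuned to minimize validation pinball loss. The paper does not reprint the loss; the function below is the standard pinball (quantile) loss definition, included only for reference.

```python
import numpy as np

def pinball_loss(s, q, alpha):
    """Standard pinball (quantile) loss at level alpha: under-predictions
    are weighted by alpha and over-predictions by (1 - alpha), so the
    minimizer is the conditional alpha-quantile of s."""
    diff = s - q
    return np.mean(np.maximum(alpha * diff, (alpha - 1) * diff))
```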
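The Dataset Splits row also specifies that FPR is measured on a held-out set used by neither the target model nor the quantile regressor. A minimal sketch of that evaluation protocol, reusing the hypothetical helpers from the attack sketch above:

```python
def evaluate_attack(q, target_model, members, holdout):
    """TPR over known training members; FPR over a held-out set that
    trained neither the target model nor the quantile regressor."""
    X_mem, y_mem = members
    X_out, y_out = holdout
    tpr = predict_membership(q, target_model, X_mem, y_mem).mean()
    fpr = predict_membership(q, target_model, X_out, y_out).mean()
    return tpr, fpr
```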