On the Intrinsic Differential Privacy of Bagging
Authors: Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate Bagging on MNIST and CIFAR10. Our experimental results demonstrate that Bagging achieves significantly higher accuracies than state-of-the-art differentially private machine learning methods with the same privacy budgets. |
| Researcher Affiliation | Academia | Hongbin Liu , Jinyuan Jia and Neil Zhenqiang Gong Duke University {hongbin.liu, jinyuan.jia, neil.gong}@duke.edu |
| Pseudocode | No | The paper describes the algorithms conceptually but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using 'open-source implementations of DPSGD and PATE from their authors' but does not provide a statement or link indicating that the code for their Bagging method is open-source. |
| Open Datasets | Yes | We adopt MNIST [LeCun et al., 2010] and CIFAR10 [Krizhevsky et al., 2009] as the sensitive training datasets... When MNIST is the sensitive training dataset, we adopt Fashion-MNIST [Xiao et al., 2017] as the public non-sensitive dataset... When CIFAR10 is the sensitive training dataset, we assume ImageNet [Deng et al., 2009] is the public non-sensitive dataset. |
| Dataset Splits | No | The paper describes training and testing on MNIST and CIFAR10, but it does not specify a validation split or how one was used for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU, CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using open-source implementations of other methods and VGG16, but it does not specify version numbers for key software components or libraries (e.g., Python, deep learning frameworks). |
| Experiment Setup | Yes | We set training epochs = 100 for both Bagging and DPSGD in both Case I and Case II. For PATE [Papernot et al., 2018], we set the number of teachers to be 250, and both teacher and student models are trained for 1,000 epochs... Following the authors of PATE, we set T = 200, σ1 = 150, and σ2 = 40... we replace the last fully connected layer of a pretrained model with a new one that has the same number of classes as the sensitive training dataset. We then fine-tune the model using a learning rate that is 10 times smaller than that used when training from scratch. |
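For readers unfamiliar with the method under evaluation: Bagging trains each base model on examples drawn uniformly at random with replacement from the sensitive training set and aggregates predictions by majority vote; the paper's privacy analysis stems from this random subsampling. The sketch below is illustrative only, using a toy nearest-centroid base learner on 1-D features (all names and the base learner are our assumptions, not the paper's actual models or code).

```python
import random
from collections import Counter

def train_centroid_model(samples):
    """Toy base learner: per-class mean of a 1-D feature (illustrative only)."""
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict_centroid(model, x):
    # Predict the class whose centroid is nearest to x.
    return min(model, key=lambda y: abs(model[y] - x))

def bagging_train(dataset, n_models, k, seed=0):
    # Each base model sees k examples sampled with replacement
    # from the sensitive training set -- the source of the
    # subsampling randomness the paper's analysis relies on.
    rng = random.Random(seed)
    return [train_centroid_model([rng.choice(dataset) for _ in range(k)])
            for _ in range(n_models)]

def bagging_predict(models, x):
    # Aggregate the base models' predictions by majority vote.
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    data = [(0.1, "a"), (0.2, "a"), (0.9, "b"), (1.1, "b")]
    models = bagging_train(data, n_models=25, k=3)
    print(bagging_predict(models, 0.15))  # likely "a"
```

The key structural point is that `k` (the subsample size) and `n_models` (the number of base models) are the quantities that govern the privacy guarantee in the paper's analysis; the base learner itself is interchangeable.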