Gradient Perturbation is Underrated for Differentially Private Convex Optimization
Authors: Da Yu, Huishuai Zhang, Wei Chen, Jian Yin, Tie-Yan Liu
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, our extensive experiments suggest that gradient perturbation with the advanced composition method indeed outperforms other perturbation approaches by a large margin, matching our theoretical findings. |
| Researcher Affiliation | Collaboration | (1) School of Data and Computer Science, Sun Yat-sen University; Guangdong Key Laboratory of Big Data Analysis and Processing, Guangzhou 510006, P.R. China. (2) Microsoft Research Asia, Beijing, China |
| Pseudocode | Yes | Algorithm 1 (DP-GD); Algorithm 2 (DP-SGD) |
| Open Source Code | No | The paper does not provide any specific links or statements about releasing open-source code for the described methodology. |
| Open Datasets | Yes | We present the results of four benchmark datasets in [Iyengar et al., 2019], including one multi-class dataset (MNIST) and two with high-dimensional features (Real-sim, RCV1). A detailed description of the datasets can be found in Table 3. |
| Dataset Splits | Yes | We use 80% of the data for training and the rest for testing, the same as [Iyengar et al., 2019]. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers). |
| Experiment Setup | Yes | The number of running steps T is chosen from {50, 200, 800} for both DP-GD and DP-SGD. The standard deviation σ of the added noise is set to the smallest value such that the privacy budget allows running the desired number of steps. The clipping threshold is set to 1 (0.5 for the high-dimensional datasets because of the sparse gradients). The privacy parameter δ is set to 1/n^2. The l2 regularization coefficient is set to 1e-4. For DP-GD, the learning rate is chosen from {0.1, 1.0, 5.0} ({0.2, 2.0, 10.0} for the high-dimensional datasets). For DP-SGD, the moments accountant is used to track the privacy loss, and the sampling ratio is set to 0.1 (i.e., the mini-batch size is roughly 0.1 times the dataset size). The learning rate of DP-SGD is twice that of DP-GD and is halved at the midpoint of training. All reported numbers are averaged over 20 runs. (A minimal code sketch of this setup follows the table.) |
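
For concreteness, the gradient-perturbation recipe in the setup above (per-example l2 clipping followed by Gaussian noise on the summed mini-batch gradient) can be sketched as below. This is a minimal illustration, not the authors' released code: the function and parameter names (`dp_sgd`, `per_example_grad`) are hypothetical, and in practice σ would be calibrated to the privacy budget via the moments accountant (DP-SGD) or advanced composition (DP-GD) rather than passed in directly.

```python
import numpy as np

def dp_sgd(per_example_grad, theta, n, sigma, clip=1.0, lr=0.2, T=200,
           sampling_ratio=0.1):
    """Sketch of DP-SGD with gradient perturbation.

    per_example_grad(theta, i) -> gradient of the loss on example i
    (hypothetical helper supplied by the caller).
    sigma is the noise standard deviation, assumed pre-calibrated so the
    privacy budget admits T steps.
    """
    batch_size = max(1, int(sampling_ratio * n))
    for t in range(T):
        # Subsample a mini-batch (sampling ratio 0.1 in the paper's setup).
        idx = np.random.choice(n, size=batch_size, replace=False)
        grads = np.stack([per_example_grad(theta, i) for i in idx])
        # Clip each per-example gradient to l2 norm at most `clip`
        # (1 in the paper, 0.5 on the high-dimensional datasets).
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        # Perturb the summed gradient with Gaussian noise of std sigma * clip.
        noise = np.random.normal(0.0, sigma * clip, size=theta.shape)
        theta = theta - lr * (grads.sum(axis=0) + noise) / batch_size
        if t == T // 2:
            lr *= 0.5  # halve the learning rate at the midpoint of training
    return theta
```

DP-GD (Algorithm 1) follows the same loop with the full dataset as the batch and the noise scale calibrated via advanced composition instead of the moments accountant.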