On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data
Authors: Di Wang, Hanshen Xiao, Srinivas Devadas, Jinhui Xu
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments suggest that our algorithms can effectively deal with the challenges caused by data irregularity." ... "Finally, we test our proposed algorithms on both synthetic and real-world datasets. Experimental results are consistent with our theoretical claims and reveal the effectiveness of our algorithms in handling heavy-tailed datasets." |
| Researcher Affiliation | Academia | Di Wang*¹², Hanshen Xiao*³, Srini Devadas³, Jinhui Xu¹. ¹Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY; ²King Abdullah University of Science and Technology, Thuwal, Saudi Arabia; ³CSAIL, MIT, Cambridge, MA. |
| Pseudocode | Yes | Algorithm 1: Sample-aggregate Framework (Nissim et al., 2007); Algorithm 2: Mechanism M in (Bun & Steinke, 2019); Algorithm 3: Heavy-tailed DP-SCO with known mean; Algorithm 4: Heavy-tailed DP-SCO with known variance. (A simplified illustration of the sample-aggregate idea is sketched below the table.) |
| Open Source Code | Yes | Due to the space limit, some definitions, all the proofs are relegated to the appendix in the Supplementary Material, which also includes the codes of experiments. |
| Open Datasets | Yes | For real-world data, we use the Adult dataset from the UCI Repository (Dua & Graff, 2017). |
| Dataset Splits | No | The paper only specifies a training and testing split ("28,000 amongst which are used as the training set and the rest are used for test") but does not explicitly mention a validation set or how it would be used for hyperparameter tuning or early stopping. (A minimal load-and-split sketch is given below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (such as exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | For the privacy parameters, we will choose ϵ = {0.1, 0.5, 1} and δ = O(1/n). See Appendix for the selections of other parameters. (A sketch of this parameter grid appears below the table.) |
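
To make the Pseudocode row more concrete, the following is a minimal sketch of the partition-then-privately-aggregate structure behind the cited sample-aggregate framework. It is not the paper's Algorithm 1: the aggregation here is a plain Gaussian-mechanism average of clipped block estimates under an assumed radius `R`, whereas Nissim et al. (2007) aggregate via smooth sensitivity. The function name, the clipping step, and the noise calibration are all illustrative assumptions.

```python
import numpy as np

def sample_aggregate_mean(data, k, epsilon, delta, R, rng=None):
    """Partition `data` into k disjoint blocks, estimate the mean on each
    block, then release a noisy aggregate of the block estimates.

    Simplified illustration only: block estimates are clipped to an assumed
    radius R so the aggregation has bounded sensitivity, and Gaussian noise
    calibrated to that sensitivity is added. The paper's Algorithm 1 instead
    follows the smooth-sensitivity aggregation of Nissim et al. (2007).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = data.shape
    perm = rng.permutation(n)
    blocks = np.array_split(data[perm], k)

    # Per-block mean estimates, clipped to the assumed radius R.
    z = np.stack([b.mean(axis=0) for b in blocks])
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    z = z * np.minimum(1.0, R / np.maximum(norms, 1e-12))

    # Changing one record affects at most one block, so the l2-sensitivity
    # of the average of clipped block estimates is 2R / k.
    sensitivity = 2.0 * R / k
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return z.mean(axis=0) + rng.normal(0.0, sigma, size=d)
```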
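
The Open Datasets and Dataset Splits rows state only that the Adult dataset (Dua & Graff, 2017) is used, with 28,000 records for training and the rest for testing. Below is a minimal load-and-split sketch under those numbers; the UCI file URL, column names, shuffling, and random seed are assumptions not taken from the paper.

```python
import pandas as pd

# Standard UCI Adult data file and column names (assumed; the report only
# names the dataset and the 28,000-record training split).
ADULT_URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country", "income",
]

def load_and_split(n_train=28000, seed=0):
    """Load the Adult data and split it into 28,000 training rows and a test set."""
    df = pd.read_csv(ADULT_URL, names=COLUMNS, skipinitialspace=True)
    df = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)  # shuffle
    train, test = df.iloc[:n_train], df.iloc[n_train:]
    return train, test
```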
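
For the Experiment Setup row, a small sketch of the stated privacy-parameter grid, assuming the concrete choice δ = 1/n for the reported δ = O(1/n); the exact constant and all remaining hyperparameters are deferred to the paper's appendix.

```python
def privacy_settings(n):
    """Enumerate the (epsilon, delta) pairs from the reported grid.

    epsilon ranges over {0.1, 0.5, 1}; delta = 1/n is one concrete choice
    consistent with the stated delta = O(1/n).
    """
    epsilons = [0.1, 0.5, 1.0]
    delta = 1.0 / n
    return [(eps, delta) for eps in epsilons]

# Example: with the 28,000-record training split, delta = 1/28000 ~= 3.6e-5.
print(privacy_settings(28000))
```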