Differentially Private Stochastic Convex Optimization under a Quantile Loss Function
Authors: Du Chen, Geoffrey A. Chua
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run simulations on synthetic datasets to demonstrate our theoretical findings empirically. We examine the excess generalization risk of an estimator $\hat{\theta}$ relative to the risk of the true optimum $\theta^*$, i.e., $(L(\hat{\theta}) - L(\theta^*))/L(\theta^*)$, and the relative estimation error $\|\hat{\theta} - \theta^*\|_2 / \|\theta^*\|_2$, in Figures 4 and 5 (exogenous error term) and Figures 6 and 7 (endogenous error term). Shaded areas represent standard deviations. For completeness, we report the computational time for solving the models in Table 2. |
| Researcher Affiliation | Academia | Du Chen¹, Geoffrey A. Chua¹. ¹Nanyang Business School, Nanyang Technological University, 639798, Singapore. |
| Pseudocode | Yes | Algorithm 1 DP-Stochastic Gradient Descent (DP-SGD). Input: private dataset $D$, privacy parameters $\varepsilon \le 1$, $\delta > 0$, kernel function $K$ with bandwidth $h > 0$, Lipschitz parameter $L = r B_x$, smoothness parameter $\beta = K B_x^2 / h$, noise variance $\sigma^2 = 8L^2 \ln(1/\delta)/\varepsilon^2$, step size $\eta > 0$. Algorithm 2 Objective Perturbation (OP). Input: private dataset $D$, privacy parameters $\varepsilon > 0$, $\delta > 0$, kernel function $K$ with bandwidth $h > 0$, Lipschitz parameter $L = r B_x$, smoothness parameter $\beta = K B_x^2 / h$, variance $\sigma^2 = L^2 (8 \ln(2/\delta) + 4\varepsilon)/\varepsilon^2$. (A hedged sketch of Algorithm 1 appears after this table.) |
| Open Source Code | No | The paper does not provide concrete access to source code, such as a specific repository link or an explicit code release statement, for the methodology described in this paper. |
| Open Datasets | No | The paper uses "synthetic datasets" with a described data generating process but does not provide concrete access information (link, DOI, repository, formal citation) for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. It uses synthetic data but does not state how the data are split for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | Input: private dataset $D$, privacy parameters $\varepsilon \le 1$, $\delta > 0$, kernel function $K$ with bandwidth $h > 0$, Lipschitz parameter $L = r B_x$, smoothness parameter $\beta = K B_x^2 / h$, noise variance $\sigma^2 = 8L^2 \ln(1/\delta)/\varepsilon^2$, step size $\eta > 0$ (from Algorithm 1). The privacy parameter $\varepsilon$ is set accordingly, and $\delta = 10^{-2}$. A logistic kernel is used. Each simulation is repeated 50 times with gradually increasing sample size $n$ (from Section 5). (A usage sketch mirroring this setup follows the table.) |
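
For readers who want to experiment with the reported setup, the following is a minimal Python sketch of Algorithm 1 (DP-SGD) as quoted above. It is not the authors' implementation: the logistic-kernel smoothing of the quantile (pinball) loss, the single-pass schedule with iterate averaging, and the reading of the Lipschitz parameter $L = r B_x$ as $\max(\tau, 1-\tau) B_x$ are all assumptions; only the noise variance $\sigma^2 = 8L^2 \ln(1/\delta)/\varepsilon^2$ is taken directly from the table.

```python
import numpy as np

def dp_sgd(X, y, epsilon, delta, tau=0.5, h=0.1, eta=0.01, T=None, rng=None):
    """Minimal sketch of Algorithm 1 (DP-SGD), under stated assumptions.

    Taken from the table: Gaussian noise with variance
    sigma^2 = 8 L^2 ln(1/delta) / epsilon^2 added to each gradient step.
    Assumed (not in the excerpt): logistic-kernel smoothing of the
    pinball loss, one pass over the data, averaged iterates, and
    L = max(tau, 1 - tau) * B_x as the reading of "L = r B_x".
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    T = n if T is None else T

    B_x = np.max(np.linalg.norm(X, axis=1))  # bound on feature norms
    L = max(tau, 1.0 - tau) * B_x            # hypothetical Lipschitz constant
    sigma = np.sqrt(8.0 * L**2 * np.log(1.0 / delta)) / epsilon

    theta = np.zeros(d)
    theta_sum = np.zeros(d)
    for _ in range(T):
        i = rng.integers(n)
        r = y[i] - X[i] @ theta              # residual of sample i
        # Smoothed subgradient of the pinball loss: the logistic kernel's
        # CDF is the sigmoid, so 1{r < 0} is replaced by sigmoid(-r / h).
        z = np.clip(-r / h, -50.0, 50.0)     # clip for numerical stability
        grad = (1.0 / (1.0 + np.exp(-z)) - tau) * X[i]
        theta = theta - eta * (grad + rng.normal(0.0, sigma, size=d))
        theta_sum += theta
    return theta_sum / T
```

Injecting Gaussian noise into every gradient step, rather than perturbing the output once, is the standard DP-SGD mechanism the pseudocode describes; the per-step variance above is the one stated in the paper's Algorithm 1 input.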
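
A hedged usage sketch mirroring the reported protocol ($\delta = 10^{-2}$, 50 repetitions per sample size, gradually increasing $n$). The linear model with a logistic exogenous error term is an assumed data-generating process, since the paper's exact DGP is not quoted in the excerpt; `theta_star` and the sample sizes are illustrative.

```python
# Synthetic-data driver mirroring the reported protocol: delta = 1e-2,
# 50 repetitions per sample size, gradually increasing n. The linear
# model with a logistic exogenous error term is an assumed DGP.
rng = np.random.default_rng(0)
theta_star = np.array([1.0, -0.5, 0.25])

for n in [500, 1000, 2000, 4000]:
    errs = []
    for _ in range(50):
        X = rng.uniform(-1.0, 1.0, size=(n, 3))
        y = X @ theta_star + rng.logistic(size=n)   # exogenous error term
        theta_hat = dp_sgd(X, y, epsilon=1.0, delta=1e-2, rng=rng)
        errs.append(np.linalg.norm(theta_hat - theta_star)
                    / np.linalg.norm(theta_star))
    print(f"n={n}: relative error {np.mean(errs):.3f} ± {np.std(errs):.3f}")
```

The printed mean and standard deviation of the relative estimation error $\|\hat{\theta} - \theta^*\|_2 / \|\theta^*\|_2$ across the 50 repetitions correspond to the curves and shaded bands reported in the paper's figures.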