Differentially Private Stochastic Convex Optimization under a Quantile Loss Function

Authors: Du Chen, Geoffrey A. Chua

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We run simulations on synthetic datasets to demonstrate our theoretical findings empirically. We examine the excess generalization risk of an estimator θ̂ relative to the risk of the true optimum θ*, i.e., (L(θ̂) − L(θ*))/L(θ*), and the relative estimation error ‖θ̂ − θ*‖₂/‖θ*‖₂, in Figures 4 and 5 (exogenous error term) and Figures 6 and 7 (endogenous error term). Shaded areas represent standard deviations. For completeness, we report computational time for solving models in Table 2.
Researcher Affiliation | Academia | Du Chen¹, Geoffrey A. Chua¹; ¹Nanyang Business School, Nanyang Technological University, 639798, Singapore.
Pseudocode | Yes | Algorithm 1, DP-Stochastic Gradient Descent (DP-SGD). Input: private dataset D, privacy parameters ε ≤ 1, δ > 0, kernel function K with bandwidth h > 0, Lipschitz parameter L = r·B_x, smoothness parameter β = K·B_x²/h, noise variance σ² = 8L² ln(1/δ)/ε², step size η > 0. Algorithm 2, Objective Perturbation (OP). Input: private dataset D, privacy parameters ε > 0, δ > 0, kernel function K with bandwidth h > 0, Lipschitz parameter L = r·B_x, smoothness parameter β = K·B_x²/h, variance σ² = L²(8 ln(2/δ) + 4ε)/ε².
Open Source Code | No | The paper does not provide concrete access to source code, such as a specific repository link or an explicit code release statement, for the methodology described in this paper.
Open Datasets | No | The paper uses "synthetic datasets" with a described data generating process but does not provide concrete access information (link, DOI, repository, formal citation) for a publicly available or open dataset.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. It uses synthetic data but does not describe how the data are split for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment.
Experiment Setup | Yes | Input: private dataset D, privacy parameters ε ≤ 1, δ > 0, kernel function K with bandwidth h > 0, Lipschitz parameter L = r·B_x, smoothness parameter β = K·B_x²/h, noise variance σ² = 8L² ln(1/δ)/ε², step size η > 0 (from Algorithm 1). The privacy parameter ε is set accordingly, and δ = 10⁻². A logistic kernel is used. Simulations are repeated 50 times with gradually increasing sample size n (from Section 5).
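The DP-SGD routine and kernel-smoothed quantile loss summarized in the table can be sketched in Python. This is a minimal illustration, not the authors' implementation: the logistic-CDF smoothing of the pinball subgradient, the full-batch update, the zero initialization, and the step-size choice are all assumptions made here; only the per-step Gaussian noise variance σ² = 8L² ln(1/δ)/ε² follows the Algorithm 1 input reported above.

```python
import numpy as np

def sigmoid(z):
    # Logistic CDF, clipped for numerical stability.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -60.0, 60.0)))

def smoothed_quantile_grad(theta, X, y, tau, h):
    """Gradient of a logistic-kernel-smoothed quantile (pinball) loss.

    The indicator 1{u <= 0} in the pinball subgradient tau - 1{u <= 0}
    is replaced by the logistic CDF sigmoid(-u/h) with bandwidth h; the
    exact smoothing in the paper may differ from this assumed form.
    """
    u = y - X @ theta              # residuals
    w = tau - sigmoid(-u / h)      # smoothed derivative w.r.t. residual
    return -(X.T @ w) / len(y)

def dp_sgd(X, y, tau, h, eps, delta, L, eta, n_steps, seed=0):
    """Sketch of DP-SGD (Algorithm 1): gradient descent with Gaussian
    noise of per-step variance sigma^2 = 8 L^2 ln(1/delta) / eps^2."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(8.0) * L * np.sqrt(np.log(1.0 / delta)) / eps
    theta = np.zeros(X.shape[1])
    for _ in range(n_steps):
        g = smoothed_quantile_grad(theta, X, y, tau, h)
        theta -= eta * (g + rng.normal(0.0, sigma, size=theta.shape))
    return theta
```

For example, on synthetic data y = Xθ* + noise with τ = 0.5 (median regression), calling `dp_sgd(X, y, tau=0.5, h=0.5, eps=1.0, delta=1e-2, L=1.0, eta=0.05, n_steps=100)` returns a privatized estimate θ̂; with ε as small as 1 the injected noise is substantial, which is consistent with the excess-risk-versus-n plots the paper reports.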