Nearly Optimal Private LASSO

Authors: Kunal Talwar, Abhradeep Guha Thakurta, Li Zhang

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We present a nearly optimal differentially private version of the well-known LASSO estimator. Our algorithm provides privacy protection with respect to each training example. The excess risk of our algorithm, compared to the non-private version, is Õ(1/n^{2/3}), assuming all the input data has bounded ℓ∞ norm. This is the first differentially private algorithm that achieves such a bound without polynomial dependence on p, under no additional assumptions on the design matrix. In addition, we show that this error bound is nearly optimal among all differentially private algorithms.
Researcher Affiliation | Industry | Kunal Talwar, Google Research (kunal@google.com); Abhradeep Thakurta, (previously) Yahoo! Labs (guhathakurta.abhradeep@gmail.com); Li Zhang, Google Research (liqzhang@google.com)
Pseudocode | Yes |

Algorithm 1: Frank-Wolfe algorithm
Input: convex set C ⊆ R^p, loss L : C → R, step sizes µt
1: Choose an arbitrary θ1 from C
2: for t = 1 to T − 1 do
3:   Compute θ̃t = argmin_{θ ∈ C} ⟨∇L(θt), θ − θt⟩
4:   Set θt+1 = θt + µt(θ̃t − θt)
5: return θT

Algorithm 2: A_{Noise-FW(polytope)} — Differentially Private Frank-Wolfe Algorithm (Polytope Case)
Input: data set D = {d1, …, dn}; loss function L(θ; D) = (1/n) Σ_{i=1}^n ℓ(θ; di), with ℓ1-Lipschitz constant L1 for ℓ; privacy parameters (ϵ, δ); convex set C = conv(S), with ‖C‖1 denoting max_{s ∈ S} ‖s‖1
1: Choose an arbitrary θ1 from C
2: for t = 1 to T − 1 do
3:   For all s ∈ S: αs ← ⟨s, ∇L(θt; D)⟩ + Lap(L1 ‖C‖1 √(8T log(1/δ)) / (nϵ)), where Lap(λ) has density (1/(2λ)) e^{−|x|/λ}
4:   θ̃t ← argmin_{s ∈ S} αs
5:   θt+1 ← (1 − µt)θt + µt θ̃t, where µt = 2/(t + 2)
6: Output θ_priv = θT
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | The paper is theoretical and does not conduct experiments on specific datasets; therefore it does not provide access information for a public training dataset.
Dataset Splits | No | The paper is theoretical and does not conduct experiments on specific datasets; therefore it does not provide dataset split information for validation.
Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for experiments.
Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe experimental setup details such as hyperparameters or training configurations.
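The private Frank-Wolfe pseudocode quoted above can be sketched concretely for the LASSO setting, where C is the ℓ1 ball of radius C1 and the vertex set S is {±C1·e_j}. This is a minimal illustrative sketch, not the authors' implementation: the function name, the `grad` callback, and the use of NumPy's Laplace sampler are my assumptions; the noise scale follows the Lap(L1‖C‖1√(8T log(1/δ))/(nϵ)) term in Algorithm 2 as reconstructed here.

```python
import numpy as np

def private_frank_wolfe_l1(grad, n, p, T, eps, delta, L1, C1=1.0, rng=None):
    """Sketch of the private Frank-Wolfe algorithm over the l1 ball.

    The l1 ball of radius C1 is conv(S) with S = {+C1*e_j, -C1*e_j},
    so the noisy linear-minimization step only needs a perturbed inner
    product with each of the 2p vertices.

    grad: callable returning the gradient of the empirical loss at theta.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.zeros(p)  # an arbitrary starting point inside C
    # Laplace scale from step 3 of Algorithm 2 (as reconstructed above)
    lam = L1 * C1 * np.sqrt(8 * T * np.log(1 / delta)) / (n * eps)
    for t in range(1, T):
        g = grad(theta)
        # <s, g> is +C1*g[j] for s = +C1*e_j and -C1*g[j] for s = -C1*e_j
        alphas = np.concatenate([C1 * g, -C1 * g])
        alphas += rng.laplace(scale=lam, size=2 * p)  # one noise draw per vertex
        k = int(np.argmin(alphas))  # noisy linear minimization over S
        s = np.zeros(p)
        if k < p:
            s[k] = C1
        else:
            s[k - p] = -C1
        mu = 2.0 / (t + 2)  # step size from step 5
        theta = (1 - mu) * theta + mu * s  # convex combination stays in C
    return theta
```

Because every iterate is a convex combination of points with ℓ1 norm at most C1, the output always lies in the constraint set, regardless of the noise.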