A Knowledge Transfer Framework for Differentially Private Sparse Learning
Authors: Lingxiao Wang, Quanquan Gu
AAAI 2020, pp. 6235–6242
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further demonstrate the superiority of our framework through both synthetic and real-world data experiments. |
| Researcher Affiliation | Academia | Lingxiao Wang, Quanquan Gu Department of Computer Science, University of California, Los Angeles {lingxw, qgu}@cs.ucla.edu |
| Pseudocode | Yes | Algorithm 1 Differentially Private Sparse Learning via Knowledge Transfer (DPSL-KT) Algorithm 2 Iterative Gradient Hard Thresholding (IGHT) |
| Open Source Code | No | No explicit statement or link providing access to the source code for the work described in this paper. |
| Open Datasets | Yes | For real data experiments, we use E2006-TFIDF dataset (Kogan et al. 2009) and RCV1 dataset (Lewis et al. 2004), for the evaluation of sparse linear regression and sparse logistic regression, respectively. |
| Dataset Splits | No | No explicit validation set splits (e.g., specific percentages or sample counts for a validation set) are provided in the paper. The paper mentions training and testing examples from the E2006-TFIDF dataset, and then subdivides the original training set into private and public datasets for their framework, using cross-validation for parameter selection. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used for running experiments are provided in the text. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) are explicitly listed in the paper. |
| Experiment Setup | Yes | For all of our experiments, we choose the parameters of different methods according to the requirements of their theoretical guarantees. More specifically, on the synthetic data experiments, we assume s* is known for all the methods. On the real data experiments, s* is unknown: neither our method nor the competing methods has knowledge of s*, so we simply choose a sufficiently large s as a surrogate of s*. Given s, for the parameter λ in our method, according to Theorem 4.5, we choose λ from a sequence of values c1 s log d log(1/δ)/(nϵ), where c1 ∈ {10^-6, 10^-5, …, 10^1}, by cross-validation. For competing methods, given s, we choose the iteration number of Frank-Wolfe from a sequence of values c2 s, where c2 ∈ {0.5, 0.6, …, 1.5}, and the regularization parameter in the objective function of Two Stage from a sequence of values c3 s/ϵ, where c3 ∈ {10^-3, 10^-2, …, 10^2}, by cross-validation. For DP-IGHT, we choose its stepsize from the grid {1/2^0, 1/2^1, …, 1/2^6} by cross-validation. For the non-private baseline, we use the non-private IGHT (Yuan, Li, and Zhang 2014). |
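The Pseudocode row above lists Algorithm 2, Iterative Gradient Hard Thresholding (IGHT), which also serves as the paper's non-private baseline. The paper gives only pseudocode; the snippet below is a minimal NumPy sketch of the generic IGHT update (a gradient step followed by hard thresholding onto the s largest-magnitude coordinates), not the authors' implementation, and the data-generating setup, step size, and iteration count are illustrative.

```python
import numpy as np

def hard_threshold(theta, s):
    """Keep the s largest-magnitude entries of theta and zero out the rest."""
    out = np.zeros_like(theta)
    if s > 0:
        top = np.argsort(np.abs(theta))[-s:]
        out[top] = theta[top]
    return out

def ight(grad_fn, d, s, step_size, n_iters):
    """Generic iterative gradient hard thresholding:
    theta <- H_s(theta - step_size * grad_fn(theta))."""
    theta = np.zeros(d)
    for _ in range(n_iters):
        theta = hard_threshold(theta - step_size * grad_fn(theta), s)
    return theta

# Illustrative use: sparse linear regression with a least-squares gradient.
rng = np.random.default_rng(0)
n, d, s = 200, 50, 5
X = rng.standard_normal((n, d))
theta_true = np.zeros(d)
theta_true[:s] = 1.0
y = X @ theta_true + 0.01 * rng.standard_normal(n)
grad = lambda theta: X.T @ (X @ theta - y) / n
theta_hat = ight(grad, d, s, step_size=0.5, n_iters=200)
```

The DP-IGHT baseline referenced in the experiment setup additionally perturbs the iterates for differential privacy; that noise-addition step is omitted from this non-private sketch.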
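The Experiment Setup row specifies hyperparameter grids chosen by cross-validation. The sketch below only constructs those grids as quoted; the problem sizes, privacy parameters (ϵ, δ), and the `cv_score` callable are hypothetical placeholders, since the quoted passage does not fix them.

```python
import numpy as np

# Hypothetical constants for illustration only (not stated in the quoted setup).
n, d, s = 10_000, 50_000, 100   # samples, dimension, surrogate sparsity
eps, delta = 1.0, 1e-5          # privacy budget (ϵ, δ)

# λ grid for DPSL-KT, following the quoted Theorem 4.5 scaling:
# c1 * s * log d * log(1/δ) / (n * ϵ), with c1 ∈ {10^-6, ..., 10^1}.
lambda_grid = [c1 * s * np.log(d) * np.log(1 / delta) / (n * eps)
               for c1 in 10.0 ** np.arange(-6, 2)]

# Frank-Wolfe iteration counts: c2 * s with c2 ∈ {0.5, 0.6, ..., 1.5}.
frank_wolfe_iters = [int(round(c2 * s)) for c2 in np.arange(0.5, 1.6, 0.1)]

# Two Stage regularization parameters: c3 * s / ϵ with c3 ∈ {10^-3, ..., 10^2}.
two_stage_reg = [c3 * s / eps for c3 in 10.0 ** np.arange(-3, 3)]

# DP-IGHT step sizes: {1/2^0, 1/2^1, ..., 1/2^6}.
dp_ight_steps = [1.0 / 2 ** k for k in range(7)]

def select_by_cv(grid, cv_score):
    """Return the grid value with the lowest cross-validated score.
    `cv_score` must be supplied by the user; the paper does not specify folds."""
    return min(grid, key=cv_score)
```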