reproducibilityindex.ai

A Knowledge Transfer Framework for Differentially Private Sparse Learning

Authors: Lingxiao Wang, Quanquan Gu
Pages: 6235-6242

AAAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We further demonstrate the superiority of our framework through both synthetic and real-world data experiments.
Researcher Affiliation	Academia	Lingxiao Wang, Quanquan Gu Department of Computer Science, University of California, Los Angeles {lingxw, qgu}@cs.ucla.edu
Pseudocode	Yes	Algorithm 1: Differentially Private Sparse Learning via Knowledge Transfer (DPSL-KT); Algorithm 2: Iterative Gradient Hard Thresholding (IGHT)
Open Source Code	No	No explicit statement or link providing access to the source code for the work described in this paper.
Open Datasets	Yes	For real data experiments, we use E2006-TFIDF dataset (Kogan et al. 2009) and RCV1 dataset (Lewis et al. 2004), for the evaluation of sparse linear regression and sparse logistic regression, respectively.
Dataset Splits	No	No explicit validation set splits (e.g., specific percentages or sample counts for a validation set) are provided in the paper. The paper mentions training and testing examples from the E2006-TFIDF dataset, and then subdivides the original training set into private and public datasets for their framework, using cross-validation for parameter selection.
Hardware Specification	No	No specific hardware details (e.g., GPU models, CPU types, memory) used for running experiments are provided in the text.
Software Dependencies	No	No specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) are explicitly listed in the paper.
Experiment Setup	Yes	For all of our experiments, we choose the parameters of different methods according to the requirements of their theoretical guarantees. More specifically, on the synthetic data experiments, we assume $s^*$ is known for all the methods. On the real data experiments, $s^*$ is unknown: neither our method nor the competing methods have knowledge of $s^*$, so we simply choose a sufficiently large $s$ as a surrogate of $s^*$. Given $s$, for the parameter $\lambda$ in our method, according to Theorem 4.5, we choose $\lambda$ from a sequence of values $c_1 s \log d \log(1/\delta)/(n\epsilon)$, where $c_1 \in \{10^{-6}, 10^{-5}, \ldots, 10^{1}\}$, by cross-validation. For competing methods, given $s$, we choose the iteration number of Frank-Wolfe from a sequence of values $c_2 s$, where $c_2 \in \{0.5, 0.6, \ldots, 1.5\}$, and the regularization parameter in the objective function of Two Stage from a sequence of values $c_3 s/\epsilon$, where $c_3 \in \{10^{-3}, 10^{-2}, \ldots, 10^{2}\}$, by cross-validation. For DP-IGHT, we choose its stepsize from the grid $\{1/2^0, 1/2^1, \ldots, 1/2^6\}$ by cross-validation. For the non-private baseline, we use the non-private IGHT (Yuan, Li, and Zhang 2014).
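
The candidate grids described in the Experiment Setup row can be written out concretely. The following is a minimal sketch, not the authors' code (the paper releases none). The function names, the use of the natural logarithm, and the rounding of Frank-Wolfe iteration counts to integers are all assumptions made for illustration.

```python
import math


def lambda_grid(s, d, n, eps, delta):
    """Candidate lambda values c1 * s * log(d) * log(1/delta) / (n * eps),
    with c1 in {1e-6, 1e-5, ..., 1e1}.
    Assumption: 'log' is the natural logarithm."""
    base = s * math.log(d) * math.log(1.0 / delta) / (n * eps)
    return [10.0 ** k * base for k in range(-6, 2)]


def frank_wolfe_iteration_grid(s):
    """Candidate iteration counts c2 * s, with c2 in {0.5, 0.6, ..., 1.5}.
    Assumption: counts are rounded to integers and kept at least 1."""
    return [max(1, round(k * s / 10)) for k in range(5, 16)]


def two_stage_regularization_grid(s, eps):
    """Candidate regularization values c3 * s / eps,
    with c3 in {1e-3, 1e-2, ..., 1e2}."""
    return [10.0 ** k * s / eps for k in range(-3, 3)]


def dp_ight_stepsize_grid():
    """Candidate DP-IGHT stepsizes 1/2^0, 1/2^1, ..., 1/2^6."""
    return [1.0 / 2 ** k for k in range(7)]


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only to exercise the functions.
    s, d, n, eps, delta = 20, 10_000, 5_000, 1.0, 1e-5
    print(lambda_grid(s, d, n, eps, delta))
    print(frank_wolfe_iteration_grid(s))
    print(two_stage_regularization_grid(s, eps))
    print(dp_ight_stepsize_grid())
```

Each grid would then be searched by cross-validation, as the paper states. The fold count and the selection metric are not given in the excerpt, so they are left out here.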