Transfer Learning In Differential Privacy’s Hybrid-Model

Authors: Refael Kohen, Or Sheffet

ICML 2022

For each reproducibility variable below, the assigned result is followed by the LLM's response.
Research Type: Experimental. "In Section 5 we pose suggestions as to how to reduce this bound as open problems, leveraging the fact that the worst-case bounds for either the number of iterations T or the density parameter κ may be drastically reduced for specific hypothesis classes / instances. We give two particular examples of such instances: one proven rigorously (for the case of PARITY under the uniform distribution) and one based on empirical evaluations of our (non-private version of the) transfer-learning technique in two high-dimensional Gaussian settings, in which our algorithm makes far fewer iterations than our O(α^-2) worst-case upper bound."
Researcher Affiliation: Academia. "Faculty of Engineering, Bar-Ilan University, Israel."
Pseudocode: Yes. The paper presents Algorithm 1 (Non-Private Subsample-Test-Reweigh), Algorithm 2 (Private Subsample-Test-Reweigh), Algorithm 3 (Online SGD), and Algorithm 4 (Private Subsample-Test-Reweigh with SGD).
Open Source Code: No. The paper provides no link to, and makes no explicit statement about releasing, source code for the described methodology.
Open Datasets: No. The paper defines synthetic distributions from which data is sampled (e.g., "S = N(0, I_d), whereas for T we picked an arbitrary set of k = 10 coordinates and set the standard deviation on these k as σ = 0.02") rather than using or providing access to a publicly available, fixed dataset; see the data-generation sketch after this list.
Dataset Splits: No. The paper generates samples from the specified distributions S and T for training and evaluation; because data is simulated rather than pre-collected, it reports no explicit training/validation/test splits (neither percentages nor absolute counts).
Hardware Specification: No. The paper says only that experiments ran "on a desktop computer" (Appendix C), giving no CPU/GPU models, memory, or other hardware details.
Software Dependencies: No. The paper names SVM and online SGD as algorithms, but specifies no software libraries or frameworks with version numbers (e.g., "PyTorch 1.9" or "scikit-learn 0.24").
Experiment Setup: Yes. "We applied the non-private version of our algorithm (Algorithm 1). In the non-private version, in order to learn in each iteration a hyperplane separator over a subsample of examples from S we used SVM, where our optimization goal is 1/2 ||w||^2 + C Σ_i max{0, 1 - y_i⟨w, x_i⟩} with a very large C = 10^30 (aiming to find as exact a separating hyperplane as possible). [...] We set α = 0.08 and κ = α / (8(χ^2 + 1)), and used the privacy parameters ϵ = 0.5 and δ = 0.0001." A sketch of this setup follows below.
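
For concreteness, here is a minimal data-generation sketch matching the synthetic distributions quoted under Open Datasets. The dimension d and the sample sizes are assumptions (the quote fixes only k = 10 and σ = 0.02), and the labeling rule is omitted because the excerpt does not specify it:

```python
import numpy as np

rng = np.random.default_rng(0)

d = 100          # ambient dimension: an assumption, not stated in the excerpt
k = 10           # number of shrunken coordinates in T (quoted)
sigma = 0.02     # standard deviation on those k coordinates (quoted)

def sample_source(n):
    """Source distribution S = N(0, I_d): standard Gaussian in every coordinate."""
    return rng.standard_normal((n, d))

def sample_target(n, coords):
    """Target distribution T: like S, but with standard deviation sigma
    on an arbitrary fixed set of k coordinates."""
    x = rng.standard_normal((n, d))
    x[:, coords] *= sigma
    return x

coords = rng.choice(d, size=k, replace=False)  # "an arbitrary set of k = 10 coordinates"
X_source = sample_source(5000)                 # sample sizes are placeholders
X_target = sample_target(500, coords)
```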
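
Similarly, a hedged sketch of the per-iteration SVM step from the Experiment Setup row, using scikit-learn's SVC as a stand-in (the paper names no library, and the χ^2 value, the feature matrix X, the labels y, and the sample weights are assumptions here):

```python
from sklearn.svm import SVC

# Parameters quoted in the setup above.
alpha = 0.08
eps, delta = 0.5, 1e-4            # privacy parameters (used only by the private variant)
chi2 = 1.0                        # placeholder: the χ^2 divergence of T from S is instance-specific
kappa = alpha / (8 * (chi2 + 1))

def fit_separator(X, y, sample_weight=None):
    """One SVM call with the quoted objective 1/2 ||w||^2 + C Σ_i hinge loss.
    C = 10^30 pushes toward an (almost) exact, hard-margin-like separator;
    a smaller C may be needed in practice for numerical stability."""
    clf = SVC(kernel="linear", C=1e30)
    clf.fit(X, y, sample_weight=sample_weight)
    return clf.coef_.ravel(), clf.intercept_[0]
```

In the Subsample-Test-Reweigh loop this fit would be repeated on reweighted subsamples of S; the reweighing rule itself is not reproduced here, since the excerpts above do not spell it out.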