Transfer Learning In Differential Privacy’s Hybrid-Model
Authors: Refael Kohen, Or Sheffet
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5 we pose suggestions as to how to reduce this bound as open problems, leveraging the fact that the worst-case bounds for either the number of iterations T or the density parameter κ may be drastically reduced for specific hypothesis classes / instances. We give two particular examples of such instances: one proven rigorously (for the case of PARITY under the uniform distribution) and one based on empirical evaluations of our (non-private version of the) transfer-learning technique in two high-dimensional Gaussian settings, in which our algorithm makes far fewer iterations than our O(α⁻²) worst-case upper bound. |
| Researcher Affiliation | Academia | Faculty of Engineering, Bar-Ilan University, Israel. |
| Pseudocode | Yes | Algorithm 1 Non-Private Subsample-Test-Reweigh, Algorithm 2 Private Subsample-Test-Reweigh, Algorithm 3 Online SGD, Algorithm 4 Private Subsample-Test-Reweigh with SGD |
| Open Source Code | No | The paper does not provide a specific link or explicit statement about the release of its source code for the described methodology. |
| Open Datasets | No | The paper defines synthetic distributions (e.g., 'S = N(0, I_d), whereas for T we picked an arbitrary set of k = 10 coordinates and set the standard deviation on these k as σ = 0.02') from which data is sampled, rather than utilizing or providing access to a publicly available, fixed dataset. (A data-generation sketch for these distributions appears after the table.) |
| Dataset Splits | No | The paper describes generating samples from specified distributions (S and T) for training and evaluation. It does not provide explicit training, validation, or test dataset splits (e.g., percentage-based or absolute counts) for a fixed dataset, as it simulates data rather than using a pre-collected one. |
| Hardware Specification | No | The paper vaguely mentions 'on a desktop computer' in Appendix C, but provides no specific details such as CPU/GPU models, memory, or other hardware components used for experiments. |
| Software Dependencies | No | The paper mentions using 'SVM' and 'online SGD' as algorithms, but it does not specify any software libraries or frameworks with their version numbers (e.g., 'PyTorch 1.9' or 'scikit-learn 0.24'). |
| Experiment Setup | Yes | We applied the non-private version of our algorithm (Algorithm 1). In the non-private version, in order to learn in each iteration a hyperplane separator over a subsample of examples from S we used SVM, where our optimization goal is ½‖w‖² + C Σᵢ max{0, 1 − yᵢ⟨w, xᵢ⟩} with a very large C = 10^30 (aiming to find as exact a hyperplane as possible). [...] We set α = 0.08 and κ = α/(8(χ²+1)), and used the privacy parameters ϵ = 0.5 and δ = 0.0001. (A sketch of this SVM subroutine appears after the table.) |
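
The two Gaussian settings quoted in the Open Datasets row can be reconstructed from the stated parameters alone. Below is a minimal sampling sketch, assuming Python with NumPy; only k = 10 and σ = 0.02 come from the paper, so the dimension d, the random seed, and the particular choice of which k coordinates are shrunk are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, sigma = 50, 10, 0.02                     # k and sigma from the paper; d = 50 is assumed
shrunk = rng.choice(d, size=k, replace=False)  # the "arbitrary set of k coordinates", fixed once

def sample_source(n):
    # S = N(0, I_d): standard Gaussian on every coordinate
    return rng.standard_normal((n, d))

def sample_target(n):
    # T: the same Gaussian, except standard deviation sigma on the k chosen coordinates
    x = rng.standard_normal((n, d))
    x[:, shrunk] *= sigma
    return x
```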
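
The SVM subroutine in the quoted setup minimizes ½‖w‖² + C Σᵢ max{0, 1 − yᵢ⟨w, xᵢ⟩} with C = 10^30, i.e., a soft-margin linear SVM pushed toward a hard-margin separator. A minimal sketch using scikit-learn's LinearSVC; the library choice is ours, as the paper names no software.

```python
from sklearn.svm import LinearSVC

def fit_separator(X, y, C=1e30):
    # loss="hinge" matches the stated objective 1/2 ||w||^2 + C * sum_i max{0, 1 - y_i <w, x_i>};
    # the paper's very large C pushes the solver toward an exact (hard-margin) separator.
    # Note: C = 1e30 is numerically delicate, and the solver may warn about convergence.
    clf = LinearSVC(C=C, loss="hinge", dual=True, fit_intercept=False, max_iter=100_000)
    clf.fit(X, y)
    return clf.coef_.ravel()  # the hyperplane normal w
```

Since the stated objective has no bias term, fit_intercept=False is used to match it exactly; the per-iteration subsample size is not pinned down by the quote, so calling fit_separator on, say, a few hundred labeled samples from S per Subsample-Test-Reweigh iteration is likewise an assumption.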