Privacy Profiles for Private Selection
Authors: Antti Koskela, Rachel Emily Redberg, Yu-Xiang Wang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerically, our approach improves over the RDP-based accounting in all regimes of interest and leads to substantial benefits in end-to-end private learning experiments. Our general result also allows analysing the case of binomially-distributed number of rounds, which leads to more concentrated distributions compared to the previously considered Poisson distribution. |
| Researcher Affiliation | Collaboration | ¹Nokia Bell Labs, ²Northeastern University, ³UC San Diego. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a link to a repository for the methodology described. |
| Open Datasets | Yes | We apply the Generalized PTR to the one-posterior sample (OPS) algorithm described in (Redberg et al., 2023), which includes privately releasing the L2-norm of the non-private solution and the smallest eigenvalue of the feature covariance matrix. The parameter to tune is the regularization strength λ (see Alg. 7, Redberg et al., 2023); we carry out a random search on a pre-defined logarithmically equidistant grid, picking a random value from the grid at each of the K rounds. Notice that we could draw the candidates from any fixed probability distribution; the only requirement is that each candidate mechanism has the same privacy profile. As baselines we use the same approach with the privacy bounds of Liu & Talwar (Thm. 3.5, 2019), the output perturbation method (Chaudhuri et al., 2011), and the non-adaptive OPS-Balanced method by Wang (2018). UCI Bike dataset (n = 17379, d = 17); UCI Elevator dataset (n = 8752, d = 18). |
| Dataset Splits | No | The paper mentions using 'UCI Bike dataset' and 'UCI Elevator dataset' and discusses 'training machine learning models', but does not specify any training, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions the 'Opacus library (Yousefpour et al., 2021)' but does not provide a specific version number for it or other software dependencies necessary for replication. |
| Experiment Setup | Yes | We fix q = 0.01 and set as thresholds ϵQ = 1.5 and δ = 10^-6. We consider three σ-candidates: 2.0, 3.0, and 4.0; for each of them, the number of iterations T is determined to be the maximum such that the privacy profile of the candidate stays below the (ϵ1, δ1)- and (ε̂, δ/m)-thresholds. As a result we can run the candidate models for 4000, 10000 and 18000 iterations, respectively. |
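The iteration-count rule quoted in the Experiment Setup row (take the largest T whose privacy profile stays below an (ϵ, δ)-threshold) can be sketched using the exact privacy profile of the Gaussian mechanism from Balle & Wang (2018). This is a simplified illustration, not the paper's accounting: it ignores the subsampling (q = 0.01) used in the experiments, so the σ values and resulting T below are illustrative and do not reproduce the reported 4000–18000 iterations. The function names and the search bound `t_max` are hypothetical.

```python
import math

def gaussian_profile_delta(eps, sigma, sensitivity=1.0):
    """delta(eps) for the Gaussian mechanism with given noise scale,
    via the exact formula of Balle & Wang (2018):
    delta(eps) = Phi(s/(2*sigma) - eps*sigma/s) - e^eps * Phi(-s/(2*sigma) - eps*sigma/s)."""
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    return Phi(a - b) - math.exp(eps) * Phi(-a - b)

def max_iterations(sigma, eps_target, delta_target, t_max=100000):
    """Largest T such that composing T (non-subsampled) Gaussian mechanisms,
    which is equivalent to one Gaussian with noise sigma / sqrt(T),
    keeps delta(eps_target) below delta_target."""
    best = 0
    for T in range(1, t_max + 1):
        if gaussian_profile_delta(eps_target, sigma / math.sqrt(T)) <= delta_target:
            best = T
        else:
            break  # profile only worsens as T grows
    return best
```

As in the quoted setup, a larger noise candidate σ admits more compositions before the profile crosses the (ϵ, δ)-threshold.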
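The candidate-selection step quoted in the Open Datasets row (a random search over a pre-defined logarithmically equidistant grid of the regularization strength λ, one draw per round) can be sketched as follows. The grid endpoints, grid size, and number of rounds K here are illustrative placeholders, not values from the paper.

```python
import random

def log_equidistant_grid(low_exp, high_exp, num):
    """num points equally spaced in log10-space, e.g. 1e-3 .. 1e2."""
    step = (high_exp - low_exp) / (num - 1)
    return [10.0 ** (low_exp + i * step) for i in range(num)]

def random_search_candidates(grid, K, seed=0):
    """Pick one lambda uniformly at random from the grid at each of K rounds."""
    rng = random.Random(seed)
    return [rng.choice(grid) for _ in range(K)]

# Illustrative usage: 20-point grid spanning 1e-3 to 1e2, 50 selection rounds.
grid = log_equidistant_grid(-3, 2, num=20)
lambdas = random_search_candidates(grid, K=50)
```

Drawing each candidate independently from the same fixed grid matches the quoted requirement that every candidate mechanism share the same privacy profile, since only λ varies across rounds.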