The Role of Adaptive Optimizers for Honest Private Hyperparameter Selection
Authors: Shubhankar Mohapatra, Sajin Sasy, Xi He, Gautam Kamath, Om Thakkar
AAAI 2022, pp. 7806-7813
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a comprehensive empirical evaluation of the proposed theoretical method... We empirically and theoretically demonstrate... We empirically show that the DPAdam optimizer... The new optimizer is compared with DPAdam and ADADP. For brevity, we show experiments on σ = 4 and others appear in the full version (Mohapatra et al. 2021). In Figure 5, we show the maximum and median accuracy curves for all the optimizers. |
| Researcher Affiliation | Collaboration | ¹University of Waterloo, ²Google |
| Pseudocode | No | The proof for Theorem 3 and the pseudo-code for DPAdam WOSM are provided in our full version (Mohapatra et al. 2021). |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the methodology described. |
| Open Datasets | Yes | We repeat the same experiment over the ENRON dataset and observe similar trends (Figures 2(c) and 2(d)). ... we evaluate this private optimizer over four diverse datasets and two learning models including logistic regression and a neural network with one hidden layer of 100 neurons (TLNN). |
| Dataset Splits | No | The dataset has been partitioned into the training set and the validation set. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as CPU/GPU models or memory specifications. |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) that would be needed for reproducibility. |
| Experiment Setup | Yes | The grids for each optimizer are shown in Table 1, where DPSGD has 40 candidates to tune over and DPAdam has 4 with fixed α = 0.001, β1 = 0.9, β2 = 0.999. ... For each dataset and model, we run DPAdam three times with hyperparameters (α, β1, β2) from the grids, α ∈ [0.001, 0.05, 0.01, 0.2, 0.5], β1, β2 ∈ [0.8, 0.85, 0.9, 0.95, 0.99, 0.999]. ... we fix a constant lot size (L = 250), and consider tuning over three different noise levels, σ ∈ [2, 4, 8], ... we also fix the clipping threshold C = 0.5, and T = 2500 iterations of training. |
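
The experiment-setup row above can be sketched in code. The following is an illustrative reconstruction, not the authors' implementation: it builds the (α, β1, β2) hyperparameter grid quoted from the paper and shows a generic DP-SGD/DPAdam-style privatized gradient step (per-example clipping to norm C, Gaussian noise scaled by σ·C, averaging over the lot size L). The function name `noisy_lot_gradient` is our own; the constants match the quoted setup (L = 250, C = 0.5, σ = 4, T = 2500).

```python
import itertools
import numpy as np

# Hyperparameter grids quoted in the table above (from the paper's Table 1).
alphas = [0.001, 0.05, 0.01, 0.2, 0.5]
betas = [0.8, 0.85, 0.9, 0.95, 0.99, 0.999]
adam_grid = list(itertools.product(alphas, betas, betas))  # all (α, β1, β2) candidates

# Fixed DP parameters from the quoted setup.
L, C, SIGMA, T = 250, 0.5, 4.0, 2500

def noisy_lot_gradient(per_example_grads, C=C, sigma=SIGMA, rng=None):
    """One privatized gradient in the DP-SGD/DPAdam style:
    clip each per-example gradient to L2-norm at most C,
    sum, add Gaussian noise with std sigma*C, and average
    over the lot. Illustrative sketch, not the paper's code."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = [g / max(1.0, np.linalg.norm(g) / C) for g in per_example_grads]
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        0.0, sigma * C, size=per_example_grads[0].shape
    )
    return noisy_sum / len(per_example_grads)
```

A tuning run in this setup would iterate over `adam_grid`, training for T iterations with `noisy_lot_gradient` feeding the optimizer's moment updates; the privacy cost then depends on σ, L, and T, which is why the paper fixes them across candidates.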