DP-HyPO: An Adaptive Private Framework for Hyperparameter Optimization

Authors: Hua Wang, Sheng Gao, Huanyu Zhang, Weijie Su, Milan Shen

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Furthermore, we empirically demonstrate the effectiveness of DP-HyPO on a diverse set of real-world datasets.
Researcher Affiliation Collaboration Hua Wang, Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, wanghua@wharton.upenn.edu; Sheng Gao, Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, shenggao@wharton.upenn.edu; Huanyu Zhang, Meta Platforms, Inc., New York, NY 10003, huanyuzhang@meta.com; Weijie J. Su, Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, suw@wharton.upenn.edu; Milan Shen, Meta Platforms, Inc., Menlo Park, CA 94025, milanshen@gmail.com
Pseudocode Yes Framework 1 DP-HyPO A(D, π^(0), 𝒯, C, c): Initialize π^(0), a prior distribution over Λ; initialize the result set A = {}. Draw T ∼ 𝒯. For j = 0 to T − 1 do: (x, q) ← Q(D, π^(j)); A = A ∪ {(x, q)}; update π^(j+1) based on A according to any adaptive algorithm such that for all λ ∈ Λ, c ≤ π^(j+1)(λ)/π^(0)(λ) ≤ C. End for. Output the (x, q) from A with the highest q.
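To make the framework loop concrete, below is a minimal Python sketch of DP-HyPO under our own assumptions: the names dp_hypo, draw_T, base_algorithm, and update_rule are hypothetical placeholders, and the clip-and-renormalize step is a crude stand-in for the paper's projection onto the bounded-density-ratio constraint, not the authors' implementation.

    import numpy as np

    def dp_hypo(data, prior, draw_T, C, c, base_algorithm, update_rule,
                rng=np.random.default_rng()):
        """Sketch of the DP-HyPO loop: adaptively sample hyperparameters, run a
        differentially private base algorithm for each, and release the best run.

        prior          : dict {lambda: pi0(lambda)} over a finite hyperparameter grid
        draw_T         : callable returning the random number of repetitions T
        base_algorithm : callable (data, lam) -> (model, score), itself DP (e.g. DP-SGD)
        update_rule    : callable (history, prior) -> proposed density over the grid
        """
        lambdas = list(prior)
        pi = dict(prior)            # current sampling distribution pi^(j)
        history = []                # result set A
        T = draw_T()

        for _ in range(T):
            # Sample lambda ~ pi^(j) and run the DP base algorithm.
            probs = np.array([pi[l] for l in lambdas], dtype=float)
            lam = lambdas[rng.choice(len(lambdas), p=probs / probs.sum())]
            model, score = base_algorithm(data, lam)
            history.append((lam, model, score))

            # Adaptive update, followed by a simple clip toward the constraint
            # c <= pi^(j+1)(lambda) / pi^(0)(lambda) <= C; the renormalization
            # makes this only an approximation of the paper's projection step.
            proposal = update_rule(history, prior)
            clipped = {l: min(max(proposal[l], c * prior[l]), C * prior[l]) for l in lambdas}
            total = sum(clipped.values())
            pi = {l: v / total for l, v in clipped.items()}

        # Output the run with the highest (privately computed) score.
        return max(history, key=lambda rec: rec[2])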
Open Source Code No The paper does not provide a direct link to its own open-source code or explicitly state that its code is being released.
Open Datasets Yes We consider the MNIST dataset, where we employ DP-SGD to train a standard CNN. ... We conduct experiments on the CIFAR-10 dataset, with a setup closely mirroring the previous experiment
Dataset Splits No The paper discusses training models and evaluating performance (accuracy) but does not explicitly provide details about specific train/validation/test dataset splits (e.g., percentages or sample counts) for reproducibility.
Hardware Specification No The paper mentions 'constraints on computational resources' and that 'a single run on CIFAR-10 is considerably more time-consuming than on MNIST', but it does not specify any particular hardware (e.g., GPU, CPU models, or cloud configurations) used for the experiments.
Software Dependencies No The paper mentions software like 'BoTorch [3]', 'CVXOPT [2]', and 'TensorFlow Privacy and Opacus' but does not specify version numbers for these dependencies.
Experiment Setup Yes We set the training batch size to be 256, and the total number of epochs to be 10. The value of σ is determined based on the allocated ε budget for each base algorithm. Specifically, σ = 0.71 for GP and σ = 0.64 for Uniform. For demonstration purposes, we set C to 2 and c to 0.75 in the GP method, so each base algorithm in the Uniform method has log(C/c) more privacy budget than the base algorithms in the GP method. In Algorithm 3, we set τ to 0.1 and β to 1. To facilitate the implementation of both methods, we discretize the learning rates and clipping norms as specified in the following setting to allow simple implementation of sampling and projection for the Uniform and GP methods. Setting E.1. We set a log-spaced grid discretization on η in the range [0.0001, 10] with a multiplicative factor of 10^(1/3), resulting in 16 observations for η. We also set a linear-spaced grid discretization on R in the range [0.3, 6] with an increment of 0.3, resulting in 20 observations for R. This gives a total of 320 hyperparameters over the search region.
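As a sanity check on Setting E.1, the 16 × 20 = 320-point hyperparameter grid can be reproduced with a short snippet; the variable names (etas, Rs, grid) are ours, not the paper's.

    import numpy as np

    # Learning rate eta: log-spaced grid on [1e-4, 10] with multiplicative factor 10^(1/3).
    etas = 1e-4 * (10 ** (1 / 3)) ** np.arange(16)      # 16 values: 1e-4, ..., 10
    # Clipping norm R: linear grid on [0.3, 6] with increment 0.3.
    Rs = np.round(np.arange(1, 21) * 0.3, 1)            # 20 values: 0.3, 0.6, ..., 6.0

    grid = [(eta, R) for eta in etas for R in Rs]
    print(len(etas), len(Rs), len(grid))                # 16 20 320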