Noise-Aware Differentially Private Regression via Meta-Learning

Authors: Ossi Räisä, Stratis Markou, Matthew Ashman, Wessel Bruinsma, Marlon Tobaben, Antti Honkela, Richard Turner

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on synthetic data and on a sim-to-real task with real data. We provide the exact experimental details in Appendix E.
Researcher Affiliation | Collaboration | Ossi Räisä (University of Helsinki, ossi.raisa@helsinki.fi); Stratis Markou (University of Cambridge, em626@cam.ac.uk); Matthew Ashman (University of Cambridge, mca39@cam.ac.uk); Wessel P. Bruinsma (Microsoft Research AI for Science, wessel.p.bruinsma@gmail.com); Marlon Tobaben (University of Helsinki, marlon.tobaben@helsinki.fi); Antti Honkela (University of Helsinki, antti.honkela@helsinki.fi); Richard E. Turner (University of Cambridge, ret26@cam.ac.uk)
Pseudocode | Yes | Algorithm 1: Meta-training a neural process. Algorithm 2: Meta-testing a neural process. Algorithm 3: DPSetConv; modifications to the original SetConv layer shown in blue. Algorithm 4: Efficient sampling of GP noise on a D-dimensional grid. (A generic grid-sampling sketch is given after the table.)
Open Source Code | Yes | We make our implementation of the DPConvCNP public in the repository https://github.com/cambridge-mlg/dpconvcnp.
Open Datasets | Yes | We evaluated the performance of the DPConvCNP in a sim-to-real task, where we train the model on simulated data and test it on the Dobe !Kung dataset [Howell, 2009], also used by Smith et al. [2018], containing age, weight and height measurements of 544 individuals. The Dobe !Kung dataset is publicly available in TensorFlow 2 [Abadi et al., 2016], specifically in the TensorFlow Datasets package. (A loading sketch is given after the table.)
Dataset Splits | Yes | Throughout optimisation, we maintain a fixed set of 2,048 tasks, generated in the same way, as a validation set.
Hardware Specification | Yes | We train the DPConvCNP on a single NVIDIA GeForce RTX 2080 Ti GPU, on a machine with 20 CPU workers.
Software Dependencies | No | For all our experiments with the DPConvCNP we use Adam with a learning rate of 3 × 10⁻⁴, setting all other options to the default TensorFlow 2 settings. We use Optuna [Akiba et al., 2019] to perform the BO, and Opacus [Yousefpour et al., 2021] to perform DP-SGD using the PRV privacy accountant. (A dependency-usage sketch is given after the table.)
Experiment Setup | Yes | For all our experiments with the DPConvCNP we use Adam with a learning rate of 3 × 10⁻⁴, setting all other options to the default TensorFlow 2 settings. For the DPConvCNP we use 6,553,600 such tasks with a batch size of 16 at training time, which is equivalent to 409,600 gradient update steps. For all our experiments, we initialise the DPSetConv and SetConv lengthscales (which are also used to sample the DP noise) to λ = 0.20, and allow this parameter to be optimised during training. (An optimiser-setup sketch is given after the table.)
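
Regarding the pseudocode row: the paper's Algorithm 4 covers efficient sampling of GP noise on a D-dimensional grid, but its exact construction is not quoted here. The following is only a minimal sketch of one standard approach, exploiting the Kronecker structure of a separable squared-exponential kernel on a grid; the function names, the jitter value, and the kernel choice are illustrative and may differ from Algorithm 4.

```python
import numpy as np

def rbf_cov(x, lengthscale):
    """1D squared-exponential covariance matrix for the points in x."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def sample_gp_on_grid(axes, lengthscale, jitter=1e-6, rng=None):
    """Draw one zero-mean GP sample with a separable SE kernel on a grid.

    Uses the Kronecker structure of the grid covariance, so only the small
    per-axis Cholesky factors are needed (O(sum_d n_d^3) work rather than
    O((prod_d n_d)^3)).
    """
    rng = np.random.default_rng() if rng is None else rng
    chols = [
        np.linalg.cholesky(rbf_cov(np.asarray(x, float), lengthscale)
                           + jitter * np.eye(len(x)))
        for x in axes
    ]
    sample = rng.standard_normal([len(x) for x in axes])
    for d, L in enumerate(chols):
        # Multiply the d-th axis of the white-noise tensor by the d-th Cholesky factor.
        sample = np.moveaxis(np.tensordot(L, sample, axes=([1], [d])), 0, d)
    return sample

# Example: correlated noise on a 64 x 64 grid over [-2, 2]^2 with lengthscale 0.2.
grid = [np.linspace(-2.0, 2.0, 64), np.linspace(-2.0, 2.0, 64)]
noise = sample_gp_on_grid(grid, lengthscale=0.2)
```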
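
Regarding the open-datasets row: a minimal loading sketch, assuming the Dobe !Kung (Howell) data is exposed through the TensorFlow Datasets catalogue. The registry name "howell", the split, and the feature keys are assumptions and should be checked against the catalogue and the paper's repository.

```python
import tensorflow_datasets as tfds

# Load the (assumed) "howell" dataset; adjust the name/split if the catalogue differs.
ds = tfds.load("howell", split="train")
for row in ds.take(3):
    # Assumed numeric features such as age, height and weight.
    print({key: value.numpy() for key, value in row.items()})
```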
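
Regarding the software-dependencies and experiment-setup rows: two small sketches of how the quoted settings might be wired up. The first sets up Adam at 3 × 10⁻⁴ with TensorFlow 2 defaults and a trainable lengthscale initialised to 0.20; the variable name and the unconstrained parameterisation are illustrative choices, not taken from the repository.

```python
import tensorflow as tf

# Adam with learning rate 3e-4; every other option left at the TensorFlow 2 defaults.
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-4)

# (DP)SetConv lengthscale initialised to 0.20 and kept trainable, as in the quoted setup.
lengthscale = tf.Variable(0.20, trainable=True, name="setconv_lengthscale")
```

The second shows Opacus DP-SGD with the PRV accountant inside an Optuna search, matching the quoted dependencies; the model, data, privacy budget, and search space are placeholders, not the paper's values.

```python
import optuna
import torch
from opacus import PrivacyEngine

def objective(trial):
    # Placeholder model, data and privacy budget; only the PRV accountant and the
    # Optuna-driven search reflect the dependencies quoted above.
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    model = torch.nn.Linear(10, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(torch.randn(256, 10), torch.randn(256, 1)),
        batch_size=32,
    )
    engine = PrivacyEngine(accountant="prv")  # PRV privacy accountant
    model, optimizer, loader = engine.make_private_with_epsilon(
        module=model,
        optimizer=optimizer,
        data_loader=loader,
        target_epsilon=1.0,
        target_delta=1e-5,
        epochs=1,
        max_grad_norm=1.0,
    )
    # ... run DP-SGD training here and return a validation metric for the BO ...
    return 0.0

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=10)
```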