Hermite Polynomial Features for Private Data Generation

Authors: Margarita Vinaroz, Mohammad-Amin Charusaie, Frederik Harder, Kamil Adamczewski, Mi Jung Park

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | As demonstrated on several tabular and image datasets, Hermite polynomial features seem better suited for private data generation than random Fourier features.
Researcher Affiliation | Academia | (1) Max Planck Institute for Intelligent Systems, Tuebingen, Germany; (2) University of British Columbia, Vancouver, Canada. CIFAR AI Chair at AMII.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/ParkLabML/DP-HP.
Open Datasets | Yes | "The dataset is publicly available at the UCI machine learning repository at the following link: https://archive.ics.uci.edu/ml/datasets/adult." "The Census dataset is also a public dataset that can be downloaded via the SDGym package." "We follow previous work in testing our method on image datasets MNIST (LeCun et al., 2010) (license: CC BY-SA 3.0) and Fashion MNIST (Xiao et al., 2017) (license: MIT)."
Dataset Splits | No | The paper does not explicitly provide training, validation, and test splits for all datasets. For the Gaussian mixture, it mentions "reserving 10% for the test set, which yields 90000 training samples", but no validation split is described for any dataset.
Hardware Specification | Yes | Our experiments were implemented in PyTorch (Paszke et al., 2019) and run using Nvidia Kepler20 and Kepler80 GPUs.
Software Dependencies | Yes | Models are taken from the scikit-learn 0.24.2 and xgboost 0.90 Python packages.
Experiment Setup | Yes | For the experimental setup of DP-HP on the image datasets, see Table 9 in Appendix Sec. H.2. Table 9 lists hyperparameters such as "Hermite order (sum kernel) 100", "gamma 5", "mini-batch size 200", "epochs 10", "learning rate 0.01".
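For readers attempting a reproduction, the hyperparameters quoted from Table 9 can be gathered in one place. The following is a minimal sketch, not the authors' code: the class name, field names, and helper methods are hypothetical, and the training-set size of 60000 is an assumption based on the standard MNIST training split rather than something stated in this summary.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DPHPImageConfig:
    """Values quoted from Table 9; all names here are hypothetical."""

    hermite_order_sum_kernel: int = 100  # "Hermite order (sum kernel) 100"
    gamma: float = 5.0                   # "gamma 5"
    batch_size: int = 200                # "mini-batch size 200"
    epochs: int = 10                     # "epochs 10"
    learning_rate: float = 0.01          # "learning rate 0.01"

    def steps_per_epoch(self, n_train: int) -> int:
        # Number of mini-batches needed to cover n_train samples once.
        return math.ceil(n_train / self.batch_size)

    def total_steps(self, n_train: int) -> int:
        # Total optimizer steps over all epochs.
        return self.epochs * self.steps_per_epoch(n_train)


config = DPHPImageConfig()

# Assumption: 60000 training images (the standard MNIST training set size).
n_train = 60_000
print(config.steps_per_epoch(n_train), config.total_steps(n_train))
```

Under these assumptions, the quoted batch size and epoch count correspond to 300 steps per epoch and 3000 steps in total; the actual training loop in the DP-HP repository may organize its iterations differently.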