Auditing Privacy Mechanisms via Label Inference Attacks

Authors: Róbert Busa-Fekete, Travis Dick, Claudio Gentile, Andrés Muñoz Medina, Adam Smith, Marika Swanberg

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct a series of experiments on benchmark and synthetic datasets measuring the privacy-utility tradeoff of a number of basic mechanisms.
Researcher Affiliation | Collaboration | Róbert István Busa-Fekete (Google Research NY, busarobi@google.com); Travis Dick (Google Research NY, tdick@google.com); Claudio Gentile (Google Research NY, cgentile@google.com); Andrés Muñoz Medina (Google Research NY, ammedina@google.com); Adam Smith (Boston University & Google DeepMind, ads22@bu.edu); Marika Swanberg (Boston University & Google Research NY, marikas@google.com)
Pseudocode | No | The paper describes mechanisms and algorithms (e.g., Randomized Response, LLP, PROPMATCH) but does not present pseudocode or clearly labeled algorithm blocks. (A randomized-response sketch appears after this table.)
Open Source Code | Yes | Reproducibility. For the sake of full reproducibility of our experimental setting and results, our code is available at the link https://github.com/google-research/google-research/tree/master/auditing_privacy_via_lia.
Open Datasets | Yes | We use the click prediction data from the KDD Cup 2012, Track 2 [3]... We also use the Higgs dataset [4]...
Dataset Splits | No | The paper does not specify a validation split percentage or sample count. It states: 'For each dataset, PET, and privacy parameters, we perform a grid search over the learning rate parameter and report the test AUC of the best performing learning rate.' This implies hyperparameter tuning, which typically relies on a validation set, but the split used is never stated. (See the tuning sketch after this table.)
Hardware Specification | Yes | We conduct our experiments on a cluster of virtual machines, each equipped with a P100 GPU, a 16-core CPU, and 16 GB of memory.
Software Dependencies | No | The paper mentions 'minibatch gradient descent with the Adam optimizer [24]' and the scikit-learn package (in its NeurIPS checklist answers), but it gives no version numbers for any software component or library required for reproducibility. (See the environment-recording snippet after this table.)
Experiment Setup | Yes | For every PET and every value of their privacy parameters, we train the model with each learning rate in {10^-6, 5×10^-6, 10^-5, 10^-4, 5×10^-4, 10^-3, 5×10^-3, 10^-2}... When training a model on the output of any PET, we always use minibatch gradient descent together with the Adam optimizer [24].
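
As a point of reference for the Pseudocode row: binary Randomized Response, one of the mechanisms the paper describes, admits a compact sketch. The snippet below is a minimal illustration, not the paper's code; the function name and NumPy-based signature are ours.

```python
import numpy as np

def randomized_response(labels, epsilon, rng=None):
    # Keep each binary label with probability e^eps / (1 + e^eps) and
    # flip it otherwise -- the standard eps-DP randomized-response rule.
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    keep_prob = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    flip = rng.random(labels.shape) >= keep_prob
    return np.where(flip, 1 - labels, labels)
```

For example, randomized_response(y, epsilon=1.0) keeps each label with probability e/(1+e) ≈ 0.73 and flips it with the remaining probability.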
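
The Dataset Splits and Experiment Setup rows describe the same tuning protocol: one model per learning rate, trained with minibatch gradient descent and Adam, with the best run's test AUC reported. Below is a minimal sketch of that protocol using scikit-learn (which the paper's checklist mentions); the 80/20 validation split, network size, batch size, and iteration count are our assumptions, since the paper states none of them.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# The learning-rate grid quoted in the Experiment Setup row.
LEARNING_RATES = [1e-6, 5e-6, 1e-5, 1e-4, 5e-4, 1e-3, 5e-3, 1e-2]

def tune_and_evaluate(X_train, y_train, X_test, y_test, seed=0):
    # Hold out 20% of the training data to select the learning rate;
    # this split is an assumption, as the paper does not specify one.
    X_tr, X_val, y_tr, y_val = train_test_split(
        X_train, y_train, test_size=0.2, random_state=seed)
    best_clf, best_val_auc = None, -np.inf
    for lr in LEARNING_RATES:
        # Minibatch training with the Adam optimizer, as the paper
        # describes; architecture and batch size are placeholders.
        clf = MLPClassifier(hidden_layer_sizes=(64,), solver="adam",
                            learning_rate_init=lr, batch_size=256,
                            max_iter=50, random_state=seed)
        clf.fit(X_tr, y_tr)
        val_auc = roc_auc_score(y_val, clf.predict_proba(X_val)[:, 1])
        if val_auc > best_val_auc:
            best_clf, best_val_auc = clf, val_auc
    # Report the test AUC of the best-performing learning rate.
    return roc_auc_score(y_test, best_clf.predict_proba(X_test)[:, 1])
```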
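
Since the Software Dependencies row flags missing version numbers, a snippet like the following (our suggestion, not part of the paper) records the exact environment alongside results:

```python
# Log Python and library versions so runs can be reproduced exactly.
import sys
from importlib import metadata

print("python", sys.version.split()[0])
for pkg in ("scikit-learn", "numpy"):
    print(pkg, metadata.version(pkg))
```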