The Privacy Onion Effect: Memorization is Relative

Authors: Nicholas Carlini, Matthew Jagielski, Chiyuan Zhang, Nicolas Papernot, Andreas Terzis, Florian Tramer

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We perform several experiments to study this effect, and understand why it occurs. The existence of this effect has various consequences. We empirically corroborate this Privacy Onion Effect on standard neural network models trained on the CIFAR-10 and CIFAR-100 image classification datasets [19]. For example, we find that if we remove the 5,000 training samples that are most at risk from membership inference, in the absence of any other effects we should mathematically expect this removal to improve the overall privacy by a factor of 15, but in reality it only improves privacy by a factor of 2. That is, the Privacy Onion Effect has caused this removal to be over 6× less effective than expected." |
| Researcher Affiliation | Industry | Nicholas Carlini, Matthew Jagielski, Nicolas Papernot, Andreas Terzis, Florian Tramer, Chiyuan Zhang (Google Research) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using "an efficient open-source training pipeline [20]" and "the open source image deduplication library imagededup [17]" but does not provide access to its own source code for the methodology described. |
| Open Datasets | Yes | "We empirically corroborate this Privacy Onion Effect on standard neural network models trained on the CIFAR-10 and CIFAR-100 image classification datasets [19]." |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test splits; it only mentions training datasets and a process of removing examples from the training set. |
| Hardware Specification | Yes | "We use an efficient open-source training pipeline [20] to train each model in just 16 GPU-seconds (we train on 16 A100 GPUs for a total of 1000 GPU-hours)." |
| Software Dependencies | No | The paper mentions software such as FFCV and imagededup but does not provide version numbers for these dependencies. |
| Experiment Setup | No | The paper describes its experimental methodology in three steps but does not provide specific hyperparameters (e.g., learning rate, batch size, optimizer) or detailed training configurations for the models. |
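The headline figures in the Research Type and Hardware Specification rows can be sanity-checked with a few lines of arithmetic. This is an illustrative sketch only: the improvement factors come from the quoted abstract, and the model count is our inference from the quoted GPU budget, not a number stated in the paper.

```python
# Quoted from the paper's abstract: removing the 5,000 most at-risk
# samples should mathematically improve privacy by a factor of 15,
# but empirically improves it by only a factor of 2.
expected_improvement = 15
observed_improvement = 2

# The removal is therefore expected/observed times less effective,
# consistent with the quoted "over 6x less effective than expected".
shortfall = expected_improvement / observed_improvement
assert shortfall > 6

# Hardware row (inference, not stated in the paper): a budget of
# 1,000 GPU-hours at ~16 GPU-seconds per model implies on the order
# of 225,000 trained models across the experiments.
gpu_seconds_total = 1000 * 3600
models_trained = gpu_seconds_total // 16

print(shortfall, models_trained)
```

The assertion makes the check self-verifying: if the quoted factors were inconsistent with the "over 6×" claim, the script would fail rather than silently print a number.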