Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

When Does Data Augmentation Help With Membership Inference Attacks?

Authors: Yigitcan Kaya, Tudor Dumitras

ICML 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluate 7 mechanisms and differential privacy, on three image classification tasks. We use three datasets for evaluation: Fashion MNIST, CIFAR-10 and CIFAR-100." |
| Researcher Affiliation | Academia | "1University of Maryland, Maryland, USA." |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | "For reproducibility and future research, we also release our source code at https://github.com/yigitcankaya/augmentation_mia." |
| Open Datasets | Yes | "We use three datasets for evaluation: Fashion MNIST (Xiao et al., 2017), CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009)." |
| Dataset Splits | Yes | "The Fashion-MNIST consists of ... 60,000 training and 10,000 validation images. The CIFAR-10 and CIFAR-100 consist of ... 50,000 training and 10,000 validation images." |
| Hardware Specification | No | The paper mentions training "modern convolutional neural networks" and "simple variants of VGG" but does not specify any hardware details such as CPU or GPU models, or cloud computing instances used for the experiments. |
| Software Dependencies | No | The paper mentions using the "ADAM optimizer (Reddi et al., 2019)" but does not list any specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | Yes | "We train our models for 35 epochs using the ADAM optimizer (Reddi et al., 2019). We set the L2 weight decay coefficient to 10^-6 and the batch size to 128." |
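To make the reported setup concrete, the hyperparameters and split sizes quoted above imply a fixed number of optimizer steps per run. The sketch below is illustrative only: the constants are taken from the quoted excerpts, while the variable names and the `steps_per_epoch` helper are our own, and the models/optimizer themselves (VGG variants trained with ADAM) are omitted.

```python
import math

# Hyperparameters as quoted in the "Experiment Setup" row;
# names here are illustrative, not taken from the released code.
EPOCHS = 35
BATCH_SIZE = 128
WEIGHT_DECAY = 1e-6  # L2 weight decay coefficient

# Training-set sizes quoted in the "Dataset Splits" row.
TRAIN_SIZES = {"Fashion-MNIST": 60_000, "CIFAR-10": 50_000, "CIFAR-100": 50_000}

def steps_per_epoch(n_train: int, batch_size: int = BATCH_SIZE) -> int:
    """Optimizer steps per epoch, counting a final partial batch."""
    return math.ceil(n_train / batch_size)

for name, n in TRAIN_SIZES.items():
    print(f"{name}: {steps_per_epoch(n)} steps/epoch, "
          f"{EPOCHS * steps_per_epoch(n)} steps total")
```

For example, CIFAR-10's 50,000 training images at batch size 128 give 391 steps per epoch, or 13,685 steps over 35 epochs.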