Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Membership Privacy for Machine Learning Models Through Knowledge Transfer

Authors: Virat Shejwalkar, Amir Houmansadr | Pages: 9549-9557

AAAI 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our extensive evaluation shows that DMP provides significantly better tradeoffs between membership privacy and classification accuracies compared to state-of-the-art MIA defenses. For instance, DMP achieves 100% accuracy improvement over adversarial regularization for Dense Net trained on CIFAR100, for similar membership privacy (measured using MIA risk): when the MIA risk is 53.7%, adversarially regularized Dense Net is 33.6% accurate, while DMP-trained Dense Net is 65.3% accurate. We have released our code at github.com/vrt1shjwlkr/AAAI21-MIA-Defense. Experimental Setup Datasets And Target Model Architectures We use four datasets and corresponding model architectures...
Researcher Affiliation	Academia	Virat Shejwalkar and Amir Houmansadr University of Massachusetts Amherst EMAIL
Pseudocode	No	The paper does not contain any explicit pseudocode or algorithm blocks. It describes the DMP technique using text and a diagram (Figure 1).
Open Source Code	Yes	We have released our code at github.com/vrt1shjwlkr/AAAI21-MIA-Defense.
Open Datasets	Yes	Purchase (Purchase 2017) is a 100 class classification task... Texas (Texas 2017) is dataset of patient records. CIFAR10 and CIFAR100 are popular image classification datasets... (citations: Purchase. 2017. Acquire Valued Shoppers Challenge. https://www.kaggle.com/c/acquire-valued-shoppers-challenge/data., Texas. 2017. Texas hospital stays dataset. https://www.dshs.texas.gov/THCIC/Hospitals/Download.shtm.)
Dataset Splits	Yes	Sizes Of Dataset Splits. The dataset splits are given in Table 1. For Purchase and Texas tasks, we use Dref of size 10k and select Xref of size 10k from the remaining data using our entropy-based criterion. For CIFAR datasets, we use Dref of size 25k and due to small sizes of these datasets, use the entire remaining 25k data as Xref. The Attack training (described shortly) column shows the MIA adversary's knowledge of members and non-members of Dtr. Following all the previous works, we assume that the adversary knows 50% of Dtr. (Table 1 provides specific |Dtr|, |Xref|, |D|, |D'| sizes).
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies	No	The paper mentions using a 'standard SGD optimizer, e.g., Adam', but does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments.
Experiment Setup	Yes	Hyperparameter Selection In DMP Increasing the temperature of softmax layer of the unprotected model, θup, used to transfer the knowledge of θup, can further reduce the membership leakage of Dtr. Similarly, reducing the size of Xref reduces MIA risk due to DMP, but comes at the cost of reduction in Atest.
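
The Experiment Setup entry above mentions raising the softmax temperature of the unprotected model θup when transferring its knowledge. The sketch below illustrates only the general temperature-scaled softmax operation; it is not taken from the paper's released code. The function names, the example logits, and the temperature values are hypothetical and chosen for illustration. It shows that a higher temperature flattens the output distribution (higher entropy, lower top probability).

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical logits for a single sample; not values from the paper.
example_logits = [4.0, 1.0, 0.5]

for t in (1.0, 2.0, 4.0):
    probs = softmax_with_temperature(example_logits, t)
    print(f"T={t}: probs={[round(p, 3) for p in probs]}, "
          f"entropy={entropy(probs):.3f}")
```

Softer predictions carry less sample-specific confidence, which is one plausible reading of why the entry associates higher temperature with lower membership leakage; the paper itself should be consulted for the temperatures actually used.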