Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Fundamental Limits of Membership Inference Attacks on Machine Learning Models
Authors: Eric Aubinais, Elisabeth Gassiat, Pablo Piantanida
JMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate those results through simple simulations. Numerical experiments are presented in Section 6.1, and the proof of Theorem 6 is detailed in Appendix E. In this section, we propose two numerical experiments to illustrate our results in Sections 4 and 5. All simulations have been conducted with the Pytorch library. We refer to Appendix D for more details on the experiments. |
| Researcher Affiliation | Academia | Eric Aubinais EMAIL Université Paris-Saclay, CNRS, Laboratoire de mathématiques d'Orsay, 91405, Orsay, France; Elisabeth Gassiat EMAIL Université Paris-Saclay, CNRS, Laboratoire de mathématiques d'Orsay, 91405, Orsay, France; Pablo Piantanida EMAIL ILLS International Laboratory on Learning Systems, MILA Quebec AI Institute, Montreal (QC), Canada, CNRS, CentraleSupélec Université Paris-Saclay |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. Methods are described in prose. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | To illustrate the impact of a discretization through the constant $C_K(P)$, we trained several 3-layers neural networks to classify samples from the MNIST dataset (Deng, 2012). |
| Dataset Splits | Yes | We generated 10000 test samples $x^{\text{test}}_1, \dots, x^{\text{test}}_{10000}$ i.i.d. $\mathrm{Unif}(S^{d-1})$ independently from the training dataset. We consider here the whole MNIST dataset, and use the given separation between training and test. For each dataset size $n = 1000, 5000, 10000$, we performed the following steps: We draw a dataset $D_n$ of size $n$ not containing the samples used for the clusterings. |
| Hardware Specification | No | The paper mentions simulations were conducted with a software library (PyTorch) but does not specify any hardware details like CPU, GPU, or memory used. |
| Software Dependencies | No | All simulations have been conducted with the Pytorch library. We constructed the clusterizing learning procedure based on the `MiniBatchKMeans` function from scikit-learn library. However, specific version numbers for these libraries are not provided. |
| Experiment Setup | Yes | We train the neural network $\Psi_\theta$ by minimizing the MSE loss with the Adam optimizer and learning rate 0.1 for 2500 iterations. We trained the neural networks by minimizing the cross-entropy loss with the Adam optimizer and learning rate 0.01 for 500 iterations. |
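
The Dataset Splits row quotes test samples drawn i.i.d. from the uniform distribution on the unit sphere. The paper excerpt does not say how these samples were generated, so the following is only a minimal sketch of one standard approach: draw a standard Gaussian vector and normalize it to unit length. The function names, the seed, and the dimension `d = 5` are hypothetical choices for illustration, and the sketch uses the Python standard library rather than the PyTorch setup the paper reports.

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Draw one point uniformly from the unit sphere S^{d-1} in R^d.

    Assumption: uses the Gaussian-normalization method, which may differ
    from the procedure actually used in the paper.
    """
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:  # guard against the measure-zero all-zero draw
            return [v / norm for v in g]


def make_test_set(n_samples, d, seed=0):
    """Generate n_samples i.i.d. points on S^{d-1} with a fixed seed."""
    rng = random.Random(seed)
    return [sample_unit_sphere(d, rng) for _ in range(n_samples)]


if __name__ == "__main__":
    # d = 5 is a hypothetical dimension chosen only for illustration.
    test_points = make_test_set(n_samples=10000, d=5, seed=0)
    print(len(test_points), len(test_points[0]))
```

Fixing the seed makes the generated test set repeatable across runs, which is the kind of detail a reproducibility check on this row would look for.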