Leave-one-out Distinguishability in Machine Learning
Authors: Jiayuan Ye, Anastasia Borovykh, Soufiane Hayou, Reza Shokri
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate that LOOD under NNGP models correlates well with the performance of membership inference attacks (Ye et al., 2022; Carlini et al., 2023) against neural networks in the leave-one-out setting for benchmark datasets (Figure 1b). We also experimentally show that mean distance LOOD under NNGP agrees with the prediction differences under leave-one-out retraining (Figure 1c) of deep neural networks. |
| Researcher Affiliation | Academia | National University of Singapore, Imperial College London |
| Pseudocode | No | The paper describes algorithms and methods using prose, mathematical equations, and figures but does not include any clearly labeled pseudocode blocks or algorithm sections. |
| Open Source Code | Yes | The code is available through this link. (Footnote 1 on page 1) |
| Open Datasets | Yes | We use Gaussian processes to model the randomness of machine learning algorithms, and validate LOOD with extensive empirical analysis of leakage using membership inference attacks. Our analytical framework enables us to investigate the causes of leakage and where the leakage is high. For example, we analyze the influence of activation functions on data memorization. Additionally, our method allows us to identify queries that disclose the most information about the training data in the leave-one-out setting. We illustrate how optimal queries can be used for accurate reconstruction of training data. |
| Dataset Splits | Yes | The leave-one-out dataset D is a class-balanced subset of CIFAR-10 with size 1000, and D' = D \ S for a differing record S. We evaluate over 200 randomly chosen S (from CIFAR-10 or uniform distribution over the pixel domain). |
| Hardware Specification | No | The paper mentions '100 GPU hours for training 50-90 FC networks' and '40 GPU minutes' for estimating LOOD (Footnote 6), but it does not specify any particular GPU models (e.g., NVIDIA V100, A100), CPU types, or memory sizes. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, PyTorch/TensorFlow version, specific library versions). |
| Experiment Setup | Yes | Examples include '10-layer FC network with ReLU activation' (Figure 2), 'fully connected network with depth 10 and width 1024' (Figure 5), and for a toy dataset, 'Noise parameter is given by σ² = 0.01' and 'Kernel is given by K(x, x') = exp(-(x - x')²/2)' (Appendix F.1). |
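The toy setup in the last row (RBF kernel K(x, x') = exp(-(x - x')²/2), noise σ² = 0.01) is enough to sketch the mean-distance LOOD computation the paper describes: fit a GP posterior on D and on D' = D \ {S}, then measure the prediction gap at candidate queries. The sketch below is illustrative only, assuming standard GP regression; the toy data, query grid, and function names are our own, not the authors' code.

```python
import numpy as np

def rbf_kernel(X1, X2):
    # K(x, x') = exp(-(x - x')^2 / 2), as stated in Appendix F.1 of the paper
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * d ** 2)

def gp_posterior_mean(X_train, y_train, X_query, noise=0.01):
    # Standard GP regression posterior mean with observation noise sigma^2 = 0.01
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_query, X_train)
    return K_star @ np.linalg.solve(K, y_train)

# Toy 1-D leave-one-out setting (illustrative data, not from the paper)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=20)
y = np.sin(X) + 0.1 * rng.standard_normal(20)

# D' = D \ {S}: drop the differing record S, here taken to be index 0
X_loo, y_loo = X[1:], y[1:]

queries = np.linspace(-3, 3, 200)
mu_full = gp_posterior_mean(X, y, queries)      # posterior mean under D
mu_loo = gp_posterior_mean(X_loo, y_loo, queries)  # posterior mean under D'

# Mean-distance LOOD at each query: |mu_D(q) - mu_D'(q)|
lood = np.abs(mu_full - mu_loo)
print("most distinguishing query:", queries[np.argmax(lood)])
```

In this toy form, the query maximizing the gap plays the role of the paper's "optimal query" that discloses the most information about the left-out record; the paper optimizes this quantity under NNGP kernels rather than a 1-D RBF.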