Leave-one-out Distinguishability in Machine Learning
Authors: Jiayuan Ye, Anastasia Borovykh, Soufiane Hayou, Reza Shokri
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate that LOOD under NNGP models correlates well with the performance of membership inference attacks (Ye et al., 2022; Carlini et al., 2023) against neural networks in the leave-one-out setting for benchmark datasets (Figure 1b). We also experimentally show that mean distance LOOD under NNGP agrees with the prediction differences under leave-one-out retraining (Figure 1c) of deep neural networks. |
| Researcher Affiliation | Academia | National University of Singapore, Imperial College London |
| Pseudocode | No | The paper describes algorithms and methods using prose, mathematical equations, and figures but does not include any clearly labeled pseudocode blocks or algorithm sections. |
| Open Source Code | Yes | The code is available through this link. (Footnote 1 on page 1) |
| Open Datasets | Yes | We use Gaussian processes to model the randomness of machine learning algorithms, and validate LOOD with extensive empirical analysis of leakage using membership inference attacks. Our analytical framework enables us to investigate the causes of leakage and where the leakage is high. For example, we analyze the influence of activation functions on data memorization. Additionally, our method allows us to identify queries that disclose the most information about the training data in the leave-one-out setting. We illustrate how optimal queries can be used for accurate reconstruction of training data. |
| Dataset Splits | Yes | The leave-one-out dataset D is a class-balanced subset of CIFAR-10 with size 1000, and D' = D \ S for a differing record S. We evaluate over 200 randomly chosen S (from CIFAR-10 or uniform distribution over the pixel domain). |
| Hardware Specification | No | The paper mentions '100 GPU hours for training 50-90 FC networks' and '40 GPU minutes' for estimating LOOD (Footnote 6), but it does not specify any particular GPU models (e.g., NVIDIA V100, A100), CPU types, or memory sizes. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, PyTorch/TensorFlow version, specific library versions). |
| Experiment Setup | Yes | Examples include '10-layer FC network with ReLU activation' (Figure 2), 'fully connected network with depth 10 and width 1024' (Figure 5), and for a toy dataset, 'Noise parameter is given by σ² = 0.01' and 'Kernel is given by K(x, x') = exp(-(x - x')²/2)' (Appendix F.1). |
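The toy setup in the last row (RBF kernel K(x, x') = exp(-(x - x')²/2), noise σ² = 0.01) is enough to sketch the mean-distance LOOD computation the paper describes: fit a GP posterior on D and on D' = D \ {S}, then measure the prediction gap at candidate queries. The sketch below is illustrative only, assuming standard GP regression; the toy data, query grid, and function names are our own, not the authors' code.

```python
import numpy as np

def rbf_kernel(X1, X2):
    # K(x, x') = exp(-(x - x')^2 / 2), as stated in Appendix F.1 of the paper
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * d ** 2)

def gp_posterior_mean(X_train, y_train, X_query, noise=0.01):
    # Standard GP regression posterior mean with observation noise sigma^2 = 0.01
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_query, X_train)
    return K_star @ np.linalg.solve(K, y_train)

# Toy 1-D leave-one-out setting (illustrative data, not from the paper)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=20)
y = np.sin(X) + 0.1 * rng.standard_normal(20)

# D' = D \ {S}: drop the differing record S, here taken to be index 0
X_loo, y_loo = X[1:], y[1:]

queries = np.linspace(-3, 3, 200)
mu_full = gp_posterior_mean(X, y, queries)      # posterior mean under D
mu_loo = gp_posterior_mean(X_loo, y_loo, queries)  # posterior mean under D'

# Mean-distance LOOD at each query: |mu_D(q) - mu_D'(q)|
lood = np.abs(mu_full - mu_loo)
print("most distinguishing query:", queries[np.argmax(lood)])
```

In this toy form, the query maximizing the gap plays the role of the paper's "optimal query" that discloses the most information about the left-out record; the paper optimizes this quantity under NNGP kernels rather than a 1-D RBF.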