Reconstruction Attacks on Machine Unlearning: Simple Models are Vulnerable
Authors: Martin Bertran, Shuai Tang, Michael Kearns, Jamie H. Morgenstern, Aaron Roth, Steven Z. Wu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We assess our attack across diverse datasets, including tabular and image data, and for both classification and regression tasks. Initially, we train a model on the complete dataset Xpriv, ypriv to derive parameters β+, and then retrain it excluding a single sample (x, y) to obtain β−. |
| Researcher Affiliation | Collaboration | Martin Bertran (Amazon AWS AI/ML); Shuai Tang (Jump Trading); Michael Kearns (University of Pennsylvania, Amazon AWS AI/ML); Jamie Morgenstern (University of Washington, Amazon AWS AI/ML); Aaron Roth (University of Pennsylvania, Amazon AWS AI/ML); Zhiwei Steven Wu (Carnegie Mellon University, Amazon AWS AI/ML) |
| Pseudocode | Yes | Algorithm 1: Generalized Attack. Require: public data Xpub ∈ R^(m×d), ypub ∈ R^m; parameter vectors β+, β− ∈ R^d; loss function ℓ(β); embedding function ϕ. Ensure: reconstructed sample x. (1) Estimate the Hessian Ĥ using Eq. (13). (2) Reconstruct the embedding z using Eq. (12). (3) If ϕ(x) = x, directly recover x = z; otherwise, reconstruct the input x using Eq. (6). (4) Return x. |
| Open Source Code | No | Open source code will be provided at a later date |
| Open Datasets | Yes | We assess our attack across diverse datasets... on Fashion MNIST (FMNIST), MNIST, and CIFAR10 datasets... (Xiao et al., 2017; LeCun et al., 1998; Krizhevsky et al., 2009). |
| Dataset Splits | No | On each task, we split the dataset into three splits, including 40% for training the target model, another 40% as the public samples for learning shadow models, and the remaining 20% as the holdout set for evaluation. |
| Hardware Specification | No | The computational costs of the experiments were small enough that they could be run serially on a single GPU machine without great effort. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Each dataset undergoes a normalization process where input features are scaled to the range [−1, 1]. |
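
The rows above describe training on the full private data to get β+, retraining without one sample to get β−, and estimating a Hessian from public data. The sketch below illustrates that idea for the simplest case, unregularized linear regression, where the normal equations give C(β+ − β−) = x·(y − xᵀβ−) with C the Gram matrix of the full private set, so the deleted sample is recovered up to sign and scale. It is an illustration under stated assumptions and not the paper's Algorithm 1: the synthetic data, the helper names (`scale_to_unit_range`, `fit_least_squares`, `cosine`), the 40%/40% split sizes applied to toy data, and the rescaled public Gram matrix used as the Hessian estimate are all hypothetical choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)

def scale_to_unit_range(X):
    """Min-max scale each feature column to [-1, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return 2.0 * (X - lo) / (hi - lo) - 1.0

def fit_least_squares(X, y):
    """Ordinary least squares without an intercept."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic regression data (assumption: stands in for a real tabular dataset).
n_total, d = 2000, 10
X = scale_to_unit_range(rng.normal(size=(n_total, d)))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n_total)

# 40% private training data, 40% public data; the remaining 20% is unused here.
n_priv = n_pub = int(0.4 * n_total)
X_priv, y_priv = X[:n_priv], y[:n_priv]
X_pub = X[n_priv:n_priv + n_pub]

# Model before deletion (beta_plus) and after deleting the first sample (beta_minus).
x_deleted = X_priv[0]
beta_plus = fit_least_squares(X_priv, y_priv)
beta_minus = fit_least_squares(X_priv[1:], y_priv[1:])
delta = beta_plus - beta_minus

# Exact Gram matrix (not available to an attacker) versus an estimate
# built only from public data, rescaled to the private sample count.
H_exact = X_priv.T @ X_priv
H_public = (n_priv / n_pub) * (X_pub.T @ X_pub)

# The direction of H @ delta matches the deleted sample up to sign and scale.
cos_exact = abs(cosine(H_exact @ delta, x_deleted))
cos_public = abs(cosine(H_public @ delta, x_deleted))
print(f"cosine with exact Gram matrix:  {cos_exact:.4f}")
print(f"cosine with public estimate:    {cos_public:.4f}")
```

With the exact Gram matrix the recovered direction matches the deleted sample to numerical precision. The public estimate is only approximate, and its quality in this toy setting depends on how closely the public and private feature distributions agree.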