Hiding in Plain Sight: Disguising Data Stealing Attacks in Federated Learning

Authors: Kostadin Garov, Dimitar Iliev Dimitrov, Nikola Jovanović, Martin Vechev

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present an extensive experimental evaluation of SEER on several datasets and realistic network architectures, demonstrating that it is able to recover private client data from batches as large as 512, even under the presence of secure aggregation (Sec. 5)."
Researcher Affiliation | Academia | Kostadin Garov (1), Dimitar I. Dimitrov (1,2), Nikola Jovanović (2), Martin Vechev (2); affiliations: (1) INSAIT, Sofia University "St. Kliment Ohridski", (2) ETH Zurich
Pseudocode | Yes | "Algorithm 1 describes the training procedure of SEER." and "Algorithm 2: Mounting SEER"
Open Source Code | Yes | "We provide an implementation of SEER at https://github.com/insait-institute/SEER."
Open Datasets | Yes | "We use the CIFAR10 dataset, as well as CIFAR100 [Krizhevsky et al., 2009] and ImageNet [Deng et al., 2009], to demonstrate the ability of SEER to scale with the number of labels and input size, respectively."
Dataset Splits | Yes | "We generally use the training set as auxiliary data, and mount the attack on randomly sampled batches of size B from the test set for CIFAR10/100 and the validation set for ImageNet." (see the data-setup sketch below)
Hardware Specification | Yes | "We run all experiments on a single NVIDIA A100 GPU with 40GB (CIFAR10/100) and 80GB (ImageNet) of VRAM."
Software Dependencies | Yes | "We implemented SEER in PyTorch 1.13."
Experiment Setup | Yes | "Throughout our experiments, we used the Adam optimizer with a learning rate of 0.0001. For CIFAR10/100, we trained for between 500 and 1000 epochs, where an epoch is defined as 1000 sampled batches from our trainset. To stabilize training convergence, we adopted gradient accumulation and thus updated our modules' parameters only once every 10 gradient steps, amounting to 100 gradient-descent steps per epoch for those datasets." (see the training-loop sketch below)
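
The Dataset Splits row states that the training set serves as the attacker's auxiliary data while the attack is mounted on randomly sampled test-set batches of size B. Below is a minimal PyTorch sketch of that data setup for CIFAR10; it only illustrates how the two data sources are drawn, and does not reproduce the SEER attack itself (which lives in the linked repository).

```python
# Minimal sketch of the CIFAR10 data setup described in the table above.
# Assumptions: torchvision is installed; the SEER attack is NOT reproduced
# here -- we only show how auxiliary and victim batches are drawn.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

B = 512  # batch size under attack; the paper reports recovery up to B = 512
to_tensor = transforms.ToTensor()

# The training set serves as the attacker's auxiliary data.
aux_set = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)
aux_loader = DataLoader(aux_set, batch_size=B, shuffle=True)

# The attack is mounted on randomly sampled batches from the test set.
test_set = datasets.CIFAR10(root="data", train=False, download=True, transform=to_tensor)
test_loader = DataLoader(test_set, batch_size=B, shuffle=True)

victim_images, victim_labels = next(iter(test_loader))  # one random victim batch
print(victim_images.shape)  # torch.Size([512, 3, 32, 32])
```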
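
The Experiment Setup row translates directly into an optimization loop: Adam with learning rate 0.0001, epochs of 1000 sampled batches, and gradient accumulation so that parameters are updated only once every 10 gradient steps, i.e. 100 optimizer steps per epoch. The sketch below illustrates that schedule only; `model`, `sample_batch`, and `seer_loss` are hypothetical stand-ins, since the paper's actual modules and objective are not reproduced here.

```python
# Hedged sketch of the optimization schedule quoted in the table above.
# Only the Adam / epoch / accumulation settings come from the paper;
# `model`, `sample_batch`, and `seer_loss` are hypothetical placeholders.
import torch

ACCUM_STEPS = 10          # parameters updated once every 10 gradient steps
BATCHES_PER_EPOCH = 1000  # one "epoch" = 1000 batches sampled from the trainset
# => 1000 / 10 = 100 optimizer steps per epoch, matching the quoted setup

model = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder for SEER's trainable modules
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def sample_batch():
    # Placeholder for drawing an auxiliary CIFAR batch (see the previous sketch).
    return torch.randn(64, 3 * 32 * 32)

def seer_loss(x):
    # Placeholder objective; the real SEER loss is defined in the paper and repo.
    return model(x).pow(2).mean()

for epoch in range(500):  # the paper trains 500-1000 epochs on CIFAR10/100
    for step in range(BATCHES_PER_EPOCH):
        loss = seer_loss(sample_batch()) / ACCUM_STEPS  # scale for accumulation
        loss.backward()                                 # gradients accumulate
        if (step + 1) % ACCUM_STEPS == 0:               # update every 10th step
            optimizer.step()
            optimizer.zero_grad()
```

Dividing the loss by `ACCUM_STEPS` keeps the accumulated gradient equal in scale to a single large-batch gradient, which is the usual reason gradient accumulation stabilizes convergence.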