Hiding in Plain Sight: Disguising Data Stealing Attacks in Federated Learning

Authors: Kostadin Garov, Dimitar Iliev Dimitrov, Nikola Jovanović, Martin Vechev

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present an extensive experimental evaluation of SEER on several datasets and realistic network architectures, demonstrating that it is able to recover private client data from batches as large as 512, even under the presence of secure aggregation (Sec. 5)."
Researcher Affiliation | Academia | Kostadin Garov (1), Dimitar I. Dimitrov (1,2), Nikola Jovanović (2), Martin Vechev (2); affiliations: (1) INSAIT, Sofia University "St. Kliment Ohridski", (2) ETH Zurich
Pseudocode | Yes | "Algorithm 1 describes the training procedure of SEER." and "Algorithm 2: Mounting SEER"
Open Source Code | Yes | "We provide an implementation of SEER at https://github.com/insait-institute/SEER."
Open Datasets | Yes | "We use the CIFAR10 dataset, as well as CIFAR100 [Krizhevsky et al., 2009] and ImageNet [Deng et al., 2009], to demonstrate the ability of SEER to scale with the number of labels and input size, respectively."
Dataset Splits | Yes | "We generally use the training set as auxiliary data, and mount the attack on randomly sampled batches of size B from the test set for CIFAR10/100 and the validation set for ImageNet." (see the data-setup sketch below)
Hardware Specification | Yes | "We run all experiments on a single NVIDIA A100 GPU with 40GB (CIFAR10/100) and 80GB (ImageNet) of VRAM."
Software Dependencies | Yes | "We implemented SEER in PyTorch 1.13."
Experiment Setup | Yes | "Throughout our experiments, we used the Adam optimizer with a learning rate of 0.0001. For CIFAR10/100, we trained for between 500 and 1000 epochs, where an epoch is defined as 1000 sampled batches from our trainset. To stabilize training convergence, we adopted gradient accumulation and thus updated our modules' parameters only once every 10 gradient steps, amounting to 100 gradient-descent steps per epoch for those datasets." (see the training-loop sketch below)
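
The Dataset Splits row states that the training set serves as the attacker's auxiliary data while the attack is mounted on randomly sampled test-set batches of size B. Below is a minimal PyTorch sketch of that data setup for CIFAR10; it only illustrates how the two data sources are drawn, and does not reproduce the SEER attack itself (which lives in the linked repository).

```python
# Minimal sketch of the CIFAR10 data setup described in the table above.
# Assumptions: torchvision is installed; the SEER attack is NOT reproduced
# here -- we only show how auxiliary and victim batches are drawn.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

B = 512  # batch size under attack; the paper reports recovery up to B = 512
to_tensor = transforms.ToTensor()

# The training set serves as the attacker's auxiliary data.
aux_set = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)
aux_loader = DataLoader(aux_set, batch_size=B, shuffle=True)

# The attack is mounted on randomly sampled batches from the test set.
test_set = datasets.CIFAR10(root="data", train=False, download=True, transform=to_tensor)
test_loader = DataLoader(test_set, batch_size=B, shuffle=True)

victim_images, victim_labels = next(iter(test_loader))  # one random victim batch
print(victim_images.shape)  # torch.Size([512, 3, 32, 32])
```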
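
The Experiment Setup row translates directly into an optimization loop: Adam with learning rate 0.0001, epochs of 1000 sampled batches, and gradient accumulation so that parameters are updated only once every 10 gradient steps, i.e. 100 optimizer steps per epoch. The sketch below illustrates that schedule only; `model`, `sample_batch`, and `seer_loss` are hypothetical stand-ins, since the paper's actual modules and objective are not reproduced here.

```python
# Hedged sketch of the optimization schedule quoted in the table above.
# Only the Adam / epoch / accumulation settings come from the paper;
# `model`, `sample_batch`, and `seer_loss` are hypothetical placeholders.
import torch

ACCUM_STEPS = 10          # parameters updated once every 10 gradient steps
BATCHES_PER_EPOCH = 1000  # one "epoch" = 1000 batches sampled from the trainset
# => 1000 / 10 = 100 optimizer steps per epoch, matching the quoted setup

model = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder for SEER's trainable modules
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def sample_batch():
    # Placeholder for drawing an auxiliary CIFAR batch (see the previous sketch).
    return torch.randn(64, 3 * 32 * 32)

def seer_loss(x):
    # Placeholder objective; the real SEER loss is defined in the paper and repo.
    return model(x).pow(2).mean()

for epoch in range(500):  # the paper trains 500-1000 epochs on CIFAR10/100
    for step in range(BATCHES_PER_EPOCH):
        loss = seer_loss(sample_batch()) / ACCUM_STEPS  # scale for accumulation
        loss.backward()                                 # gradients accumulate
        if (step + 1) % ACCUM_STEPS == 0:               # update every 10th step
            optimizer.step()
            optimizer.zero_grad()
```

Dividing the loss by `ACCUM_STEPS` keeps the accumulated gradient equal in scale to a single large-batch gradient, which is the usual reason gradient accumulation stabilizes convergence.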