Deep Leakage from Gradients

Authors: Ligeng Zhu, Zhijian Liu, Song Han

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results show that our attack is much stronger than previous approaches: the recovery is pixel-wise accurate for images and token-wise matching for texts. Thereby we want to raise people's awareness to rethink the gradient's safety. We also discuss several possible strategies to prevent such deep leakage. Without changes on the training setting, the most effective defense method is gradient pruning. We evaluate the effectiveness of our algorithm on both vision (image classification) and language tasks (masked language model). On various datasets and tasks, DLG fully recovers the training data in just a few gradient steps."
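The gradient-pruning defense named in the excerpt above can be illustrated with a minimal sketch: zero out the fraction of gradient entries with the smallest magnitudes before sharing the gradient. The function name, the flat-list representation, and the pruning fraction `p` are illustrative assumptions; the paper prunes per-layer gradient tensors.

```python
def prune_gradient(grad, p):
    """Zero out the fraction p of entries with the smallest magnitudes.

    A minimal sketch of the gradient-pruning defense: only the largest
    (1 - p) fraction of gradient entries survive. `grad` is a flat list
    of floats (illustrative; real gradients are per-layer tensors).
    """
    k = int(len(grad) * p)  # number of entries to zero out
    if k == 0:
        return list(grad)
    # Magnitude threshold: the k-th smallest absolute value.
    threshold = sorted(abs(g) for g in grad)[k - 1]
    return [0.0 if abs(g) <= threshold else g for g in grad]

grads = [0.01, -0.5, 0.003, 0.9, -0.02, 0.0007]
pruned = prune_gradient(grads, 0.5)  # small entries zeroed, large ones kept
```

The paper reports that pruning of this kind degrades DLG's reconstruction because the attacker's gradient-matching objective no longer sees the full gradient signal.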
Researcher Affiliation | Academia | Ligeng Zhu, Zhijian Liu, Song Han. Massachusetts Institute of Technology. {ligeng, zhijian, songhan}@mit.edu
Pseudocode | Yes | "Algorithm 1: Deep Leakage from Gradients."
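The algorithm referenced above alternates between computing the gradient of dummy data and updating that dummy data to match the leaked gradient. The following self-contained sketch runs the idea on a single linear neuron with squared loss; the model, the finite-difference descent with backtracking (a stand-in for the L-BFGS optimizer the paper uses), and all names are my own assumptions, not the paper's implementation.

```python
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def model_grad(w, x, y):
    """Analytic gradient of L = (w.x - y)^2 with respect to the weights w."""
    err = dot(w, x) - y
    return [2.0 * err * xi for xi in x]

def match_loss(w, g_leaked, x, y):
    """||grad(x, y) - g_leaked||^2: the gradient-matching objective of DLG."""
    g = model_grad(w, x, y)
    return sum((gi - li) ** 2 for gi, li in zip(g, g_leaked))

random.seed(0)
w = [0.3, -0.7, 0.5]                          # shared weights, known to the attacker
x_secret, y_secret = [1.0, 2.0, -1.0], 0.5    # private training example
g_leaked = model_grad(w, x_secret, y_secret)  # the only thing the attacker observes

# Randomly initialized dummy data, refined by descending the matching loss.
x = [random.uniform(-1, 1) for _ in range(3)]
y = random.uniform(-1, 1)
eps = 1e-6
initial = match_loss(w, g_leaked, x, y)
for _ in range(300):
    base = match_loss(w, g_leaked, x, y)
    # Finite-difference gradient of the matching loss w.r.t. (x, y).
    gx = []
    for i in range(3):
        xp = list(x)
        xp[i] += eps
        gx.append((match_loss(w, g_leaked, xp, y) - base) / eps)
    gy = (match_loss(w, g_leaked, x, y + eps) - base) / eps
    step = 0.1
    while step > 1e-9:  # backtrack until the update actually improves the loss
        x_new = [xi - step * gi for xi, gi in zip(x, gx)]
        y_new = y - step * gy
        if match_loss(w, g_leaked, x_new, y_new) < base:
            x, y = x_new, y_new
            break
        step /= 2
final = match_loss(w, g_leaked, x, y)
```

Note that this one-neuron toy has a scale ambiguity (any rescaled `x` with a compensating `y` produces the same gradient), so the matching loss shrinks without pinning down the exact secret; in the deep networks the paper attacks, the richer gradient constrains the data much more tightly.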
Open Source Code | No | The paper mentions an "official implementation" for BERT with a footnote linking to `https://github.com/google-research/bert`; this is a third-party tool the authors used, not their own source code for the DLG method.
Open Datasets | Yes | "We experiment our algorithm on modern CNN architectures ResNet-56 [12] and pictures from MNIST [22], CIFAR-100 [21], SVHN [28] and LFW [14]." [22] Yann LeCun. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/. [21] Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009. [28] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng. Reading digits in natural images with unsupervised feature learning. 2011. [14] Gary B. Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller. Labeled Faces in the Wild: a database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, October 2007.
Dataset Splits | No | The paper describes the datasets used (MNIST, CIFAR-100, SVHN, LFW) but does not specify how they were split into training, validation, and test sets, such as percentages or sample counts.
Hardware Specification | No | The paper mentions "GPU memory footprints" in the context of half-precision training, but it does not specify the GPU models, CPU types, or other hardware used for the experiments.
Software Dependencies | No | The paper states: "we choose PyTorch [29] as our experiment platform." While PyTorch is mentioned, no version number is provided for it or any other software dependency.
Experiment Setup | Yes | "We use L-BFGS [25] with learning rate 1, history size 100 and max iterations 20, and optimize for 1200 iterations and 100 iterations for the image and text tasks respectively. Two changes we have made to the models are replacing the ReLU activation with Sigmoid and removing strides, as our algorithm requires the model to be twice-differentiable. For image labels, instead of directly optimizing the discrete categorical values, we randomly initialize a vector with shape N × C, where N is the batch size and C is the number of classes, and then take its softmax output as the one-hot label for optimization. We list the iterations required for convergence for different batch sizes in Tab. 2 and provide visualized results in Fig. 6. The larger the batch size is, the more iterations DLG requires to attack."
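The label trick quoted above (optimizing a continuous N × C logit vector whose softmax stands in for the discrete one-hot label, keeping the whole objective differentiable) can be sketched as follows; the variable names and the plain-list representation are illustrative, not the paper's code.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

random.seed(0)
N, C = 2, 5  # batch size and number of classes (illustrative values)
# Continuous logits, randomly initialized. DLG optimizes these logits
# (not discrete class indices), so the label stays differentiable.
logits = [[random.gauss(0.0, 1.0) for _ in range(C)] for _ in range(N)]
soft_labels = [softmax(row) for row in logits]  # each row sums to 1
```

As the attack converges, each softmax row sharpens toward a one-hot vector, from which the discrete label is read off with an argmax.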