Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach

Authors: Xinwei Zhang, Zhiqi Bu, Steven Wu, Mingyi Hong

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical results on standard datasets show that the proposed algorithm achieves higher accuracies than DPSGD while maintaining the same level of DP guarantee.
Researcher Affiliation | Collaboration | Xinwei Zhang, University of Minnesota, zhan6234@umn.edu; Zhiqi Bu, Amazon AI, woodyx218@gmail.com; Zhiwei Steven Wu, Carnegie Mellon University, zstevenwu@cmu.edu; Mingyi Hong, University of Minnesota, mhong@umn.edu
Pseudocode | Yes | Algorithm 1: DPSGD Algorithm with Gradient Clipping; Algorithm 2: DiceSGD Algorithm; Algorithm 3: Adam variant of DiceSGD Algorithm; Algorithm 4: Automatic DiceSGD Algorithm (without C1, C2). An illustrative sketch of a clipped DPSGD step and a generic error-feedback step is given after the table.
Open Source Code | No | The paper does not contain any statement about releasing code or a link to a code repository.
Open Datasets | Yes | We use both CIFAR-10 and CIFAR-100 datasets for experiments and use ViT-small (Dosovitskiy et al., 2020) as the training model, which is pre-trained on ImageNet.
Dataset Splits | No | The paper mentions using the CIFAR-10, CIFAR-100, and E2E NLG Challenge datasets. While these are standard benchmarks, the paper does not explicitly state split percentages, sample counts, or cite a predefined split methodology; it discusses fine-tuning and batch size but not split ratios.
Hardware Specification | Yes | The experiments were run on an Intel Xeon W-2102 CPU with an NVIDIA TITAN X GPU for image classification, and on an NVIDIA A100 GPU for NLP tasks.
Software Dependencies | No | The paper mentions using the 'Adam variant of DPSGD-GC developed following Bu et al. (2021)' and the 'GPT-2 model (Radford et al., 2018)', but does not provide specific software names with version numbers for replication (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | We fine-tune the model for 3 epochs with batch size B = 1000. The stepsizes for DPSGD-GC and DiceSGD are selected through a grid search over η ∈ {10^-2, 10^-3, 10^-4}. For GPT-2: fine-tune the GPT-2 model (Radford et al., 2018) on the E2E NLG Challenge for 10 epochs with batch size B = 1000 and initial stepsize η0 = 2 × 10^-3 with learning-rate warm-up and linear decay. These settings are collected into a config sketch after the table.
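
To make the Pseudocode row concrete, below is a minimal NumPy sketch of one DPSGD step with per-sample gradient clipping (the table's Algorithm 1) alongside a generic error-feedback step that carries the residual removed by clipping into the next iteration. The error-feedback step illustrates the general idea only and is not a reproduction of the paper's DiceSGD (Algorithm 2); all variable names (clip_C, noise_mult, etc.) are our own.

```python
# Sketch of a clipped DPSGD step (Algorithm 1) and a generic error-feedback
# variant. The error-feedback step is an illustrative assumption, not the
# paper's exact DiceSGD (Algorithm 2).
import numpy as np

def dpsgd_gc_step(w, per_sample_grads, clip_C, noise_mult, lr, rng):
    """One DPSGD-GC update: clip each per-sample gradient to norm clip_C,
    average, and add Gaussian noise calibrated to the clipping threshold."""
    B = len(per_sample_grads)
    clipped = [g * min(1.0, clip_C / (np.linalg.norm(g) + 1e-12))
               for g in per_sample_grads]
    noisy_avg = np.mean(clipped, axis=0) + rng.normal(
        0.0, noise_mult * clip_C / B, size=w.shape)
    return w - lr * noisy_avg

def error_feedback_step(w, e, per_sample_grads, clip_C, noise_mult, lr, rng):
    """Generic error-feedback step: the part of the gradient removed by
    clipping is stored in the state e and re-injected at the next step,
    which is the high-level idea behind removing clipping bias."""
    B = len(per_sample_grads)
    fed_back = [g + e for g in per_sample_grads]      # re-inject past residual
    clipped = [g * min(1.0, clip_C / (np.linalg.norm(g) + 1e-12))
               for g in fed_back]
    avg_clipped = np.mean(clipped, axis=0)
    e_next = np.mean(fed_back, axis=0) - avg_clipped  # residual carried forward
    noisy_update = avg_clipped + rng.normal(
        0.0, noise_mult * clip_C / B, size=w.shape)
    return w - lr * noisy_update, e_next

# Toy usage with random per-sample gradients.
rng = np.random.default_rng(0)
w, e = np.zeros(10), np.zeros(10)
grads = [rng.normal(size=10) for _ in range(8)]
w = dpsgd_gc_step(w, grads, clip_C=1.0, noise_mult=1.0, lr=1e-3, rng=rng)
w, e = error_feedback_step(w, e, grads, clip_C=1.0, noise_mult=1.0, lr=1e-3, rng=rng)
```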
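
The hyperparameters quoted in the Experiment Setup row can also be collected into a small config sketch. Only the numbers quoted above (epochs, batch sizes, step-size grid, η0 = 2 × 10^-3) come from the paper; the warm-up fraction and every identifier in this snippet are assumptions for illustration.

```python
# Hedged config sketch of the reported experiment setup, plus a generic
# linear warm-up / linear decay schedule for the GPT-2 run.

CIFAR_FINETUNE = {
    "model": "ViT-small, pre-trained on ImageNet",
    "datasets": ["CIFAR-10", "CIFAR-100"],
    "epochs": 3,
    "batch_size": 1000,
    "stepsize_grid": [1e-2, 1e-3, 1e-4],  # grid-searched for DPSGD-GC and DiceSGD
}

GPT2_E2E_FINETUNE = {
    "model": "GPT-2",
    "dataset": "E2E NLG Challenge",
    "epochs": 10,
    "batch_size": 1000,
    "initial_stepsize": 2e-3,
}

def linear_warmup_linear_decay(step, total_steps, eta0=2e-3, warmup_frac=0.1):
    """Learning rate with linear warm-up followed by linear decay to zero.
    The 10% warm-up fraction is an assumption, not taken from the paper."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    if step < warmup_steps:
        return eta0 * (step + 1) / warmup_steps
    remaining = max(1, total_steps - warmup_steps)
    return eta0 * max(0.0, (total_steps - step) / remaining)
```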