Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Balancing Utility and Privacy: Dynamically Private SGD with Random Projection
Authors: Zhanhong Jiang, Md Zahid Hasan, Nastaran Saadati, Aditya Balu, Chao Liu, Soumik Sarkar
TMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across diverse datasets show that D2P2-SGD remarkably enhances accuracy while maintaining privacy. Our code is available here. ... Extensive evaluations on a wide spectrum of datasets confirm that D2P2-SGD significantly improves model accuracy compared to baseline methods. |
| Researcher Affiliation | Academia | Zhanhong Jiang# EMAIL Md Zahid Hasan** EMAIL Nastaran Saadati* EMAIL Aditya Balu# EMAIL Chao Liu*** EMAIL Soumik Sarkar*# EMAIL *Department of Mechanical Engineering, #Translational AI Center, **Department of Electrical and Computer Engineering, Iowa State University ***Department of Energy and Power Engineering, Tsinghua University |
| Pseudocode | Yes | Algorithm 1 D2P2-SGD. 1: Initialize: model parameters x₁, step size α, number of epochs K, lower dimension p, random matrices A₁, A₂, …, A_K, mini-batch size B, training dataset D, noise sequence σ²_{ε,1}, σ²_{ε,2}, …, σ²_{ε,K}, gradient clipping parameter γ. 2: for k = 1, …, K do 3: Split the dataset D into mini-batches of size B and randomly sample one mini-batch B. 4: Compute per-sample clipped gradients: ĝᵏₛ = ∇f(xₖ; s) / (‖∇f(xₖ; s)‖ + γ), s ∈ B. 5: Calculate the mini-batch stochastic gradient: gₖ = (1/B) Σ_{s∈B} ĝᵏₛ. 6: Project the noisy gradient using Aₖ: g̃ₖ = Aₖ((1/p) Aₖᵀ gₖ + εₖ), εₖ ~ N(0, σ²_{ε,k} I_p). 7: Update model parameters: x_{k+1} = xₖ − α g̃ₖ. 8: end for 9: return x_K |
| Open Source Code | Yes | Extensive experiments across diverse datasets show that D2P2-SGD remarkably enhances accuracy while maintaining privacy. Our code is available here. |
| Open Datasets | Yes | Additionally, the datasets for testing our algorithms include Fashion MNIST and SVHN Figueroa (2019). ...In Figures 8 and 9, results for the CIFAR-10 dataset are provided... ...Similarly, for Figures 11-13 (KMNIST, EMNIST, MNIST), D2P2-SGD is favorably comparable to or outperforms all baselines... |
| Dataset Splits | No | Split the dataset D into mini-batches of size B and randomly sample one mini-batch B |
| Hardware Specification | Yes | All the experiments were conducted on a machine equipped with an Intel Xeon Silver 4110 CPU and an NVIDIA Titan RTX GPU. |
| Software Dependencies | No | We leverage the Opacus library Yousefpour et al. (2021) and build the framework on top of it. |
| Experiment Setup | Yes | Table 6: Hyperparameters for experiments — learning rate α = 0.01; clipping parameter γ = 0.01; batch size B ∈ {256, 512, 1024}; number of epochs K = 40; injected noise variance σ_ε = 3.0; sampling variance = 1; percentage of dimensionality reduction = 0.7; number of random seeds = 4. |
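The Algorithm 1 excerpt quoted in the table can be sketched as a single update step. This is a minimal numpy illustration, not the authors' implementation: the smoothed clipping form g/(‖g‖ + γ) and the 1/p projection scaling are reconstructed from the garbled excerpt, and the function name `d2p2_sgd_step` is hypothetical.

```python
import numpy as np

def d2p2_sgd_step(x, per_sample_grads, alpha, gamma, p, sigma, rng):
    """One D2P2-SGD update (sketch). per_sample_grads has shape (B, d)."""
    B, d = per_sample_grads.shape
    # Step 4: per-sample clipping, reconstructed as g / (||g|| + gamma)
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    clipped = per_sample_grads / (norms + gamma)
    # Step 5: mini-batch stochastic gradient
    g = clipped.mean(axis=0)
    # Step 6: fresh random projection A_k (d x p), Gaussian noise in low dim,
    # then lift back: g_tilde = A_k ((1/p) A_k^T g + eps)
    A = rng.standard_normal((d, p))
    eps = rng.normal(0.0, sigma, size=p)
    g_tilde = A @ ((A.T @ g) / p + eps)
    # Step 7: gradient step
    return x - alpha * g_tilde
```

In a dynamically private schedule, `sigma` would vary with the epoch index k (the σ²_{ε,k} sequence in the algorithm); here it is passed as a scalar for simplicity.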