Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Balancing Utility and Privacy: Dynamically Private SGD with Random Projection

Authors: Zhanhong Jiang, Md Zahid Hasan, Nastaran Saadati, Aditya Balu, Chao Liu, Soumik Sarkar

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments across diverse datasets show that D2P2-SGD remarkably enhances accuracy while maintaining privacy. Our code is available here. ... Extensive evaluations on a wide spectrum of datasets confirm that D2P2-SGD significantly improves model accuracy compared to baseline methods.
Researcher Affiliation Academia Zhanhong Jiang# (EMAIL), Md Zahid Hasan** (EMAIL), Nastaran Saadati* (EMAIL), Aditya Balu# (EMAIL), Chao Liu*** (EMAIL), Soumik Sarkar*# (EMAIL). *Department of Mechanical Engineering, #Translational AI Center, **Department of Electrical and Computer Engineering, Iowa State University; ***Department of Energy and Power Engineering, Tsinghua University
Pseudocode Yes Algorithm 1: D2P2-SGD
1: Initialize: model parameters x_1, step size α, number of epochs K, lower dimension p, random matrices A_1, A_2, ..., A_K, mini-batch size B, training dataset D, noise sequence σ²_{ε,1}, σ²_{ε,2}, ..., σ²_{ε,K}, gradient clipping parameter γ
2: for k = 1, ..., K do
3:   Split the dataset D into mini-batches of size B and randomly sample one mini-batch B_k
4:   Compute per-sample clipped gradients: ĝ_k^s = ∇f(x_k; s) / (‖∇f(x_k; s)‖ + γ), for s ∈ B_k
5:   Calculate the mini-batch stochastic gradient: g_k = (1/B) Σ_{s∈B_k} ĝ_k^s
6:   Project the noisy gradient using A_k: g̃_k = A_k((1/p) A_kᵀ g_k + ε_k), where ε_k ~ N(0, σ²_{ε,k} I_p)
7:   Update model parameters: x_{k+1} = x_k − α g̃_k
8: end for
9: return x_K
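The algorithm above can be sketched in plain NumPy. This is a minimal illustration of the loop structure (per-sample normalized clipping, low-dimensional projection with added noise, and lift-back), not the authors' implementation; the function names, the `noise_std` parameterization (standard deviation rather than the paper's variance σ²_{ε,k}), and the Gaussian choice for A_k are assumptions made for the sketch.

```python
import numpy as np

def d2p2_sgd(grad_fn, x1, alpha, K, p, B, dataset, noise_std, gamma, rng=None):
    """Sketch of D2P2-SGD (assumed details; see lead-in for caveats).

    grad_fn(x, sample) -> per-sample gradient of f at x.
    noise_std: sequence of length K with the per-epoch noise scale.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x1, dtype=float).copy()
    d = x.size
    for k in range(K):
        # Randomly sample one mini-batch of size B from the dataset.
        batch = rng.choice(len(dataset), size=B, replace=False)
        # Per-sample normalized clipping: g_hat = grad / (||grad|| + gamma).
        g = np.zeros(d)
        for s in batch:
            gs = grad_fn(x, dataset[s])
            g += gs / (np.linalg.norm(gs) + gamma)
        g /= B
        # Random projection: compress with A_k^T, add low-dim noise, lift back.
        A = rng.standard_normal((d, p))  # assumed Gaussian random matrix
        eps = rng.normal(0.0, noise_std[k], size=p)
        g_tilde = A @ ((A.T @ g) / p + eps)
        x = x - alpha * g_tilde
    return x
```

With zero noise and p = d this reduces to SGD with a randomly rescaled, clipped gradient, which makes the privacy/utility knobs (p and σ_ε) easy to experiment with in isolation.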
Open Source Code Yes Extensive experiments across diverse datasets show that D2P2-SGD remarkably enhances accuracy while maintaining privacy. Our code is available here.
Open Datasets Yes Additionally, the datasets for testing our algorithms include Fashion MNIST and SVHN Figueroa (2019). ...In Figures 8 and 9, results for the CIFAR-10 dataset are provided... ...Similarly, for Figures 11-13 (KMNIST, EMNIST, MNIST), D2P2-SGD is favorably comparable to or outperforms all baselines...
Dataset Splits No Split the dataset D into mini-batches of size B and randomly sample one mini-batch B
Hardware Specification Yes All the experiments were conducted on a machine equipped with an Intel Xeon Silver 4110 CPU and an NVIDIA Titan RTX GPU.
Software Dependencies No We leverage the Opacus library Yousefpour et al. (2021) and build the framework on top of it.
Experiment Setup Yes Table 6: Hyperparameters for experiments.
Hyperparameter                           | Value
Learning rate α                          | 0.01
Clipping parameter γ                     | 0.01
Batch size B                             | (256, 512, 1024)
Number of epochs K                       | 40
Injected noise variance σ_ε              | 3.0
Sampling variance                        | 1
Percentage of dimensionality reduction   | 0.7
Number of random seeds                   | 4
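For reference, the reported hyperparameters map onto a plain configuration dictionary. The key names below are illustrative placeholders, not identifiers from the authors' code.

```python
# Hypothetical config mirroring Table 6 of the paper (key names assumed).
hparams = {
    "learning_rate": 0.01,            # step size alpha
    "clip_param": 0.01,               # gradient clipping parameter gamma
    "batch_sizes": (256, 512, 1024),  # B, swept over three values
    "num_epochs": 40,                 # K
    "noise_variance": 3.0,            # injected noise variance sigma_eps
    "sampling_variance": 1,
    "dim_reduction_pct": 0.7,         # percentage of dimensionality reduction
    "num_seeds": 4,                   # random seeds per configuration
}
```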