reproducibilityindex.ai

Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning

Authors: Christopher A. Choquette-Choo, Hugh Brendan McMahan, J Keith Rush, Abhradeep Guha Thakurta

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive empirical evaluation on both example-level DP for image classification and user-level DP for language modeling demonstrate substantial improvements over all previous methods, including the widely-used DP-SGD.
Researcher Affiliation	Industry	Google Research. Correspondence to: <{cchoquette,krush,mcmahan,athakurta}@google.com>.
Pseudocode	Yes	Algorithm 1 DP-Prefix Sum Computation via FFT (with d = 1)
Open Source Code	Yes	Our code is at: https://github.com/google-research/federated/tree/master/multi_epoch_dp_matrix_factorization.
Open Datasets	Yes	We train image classification models on CIFAR10 (Krizhevsky, 2009)... and We use the standard benchmark: Stack Overflow next-word prediction (Reddi et al., 2020).
Dataset Splits	Yes	We train image-classification models using the CIFAR10 dataset as hosted in tensorflow-datasets, containing 50,000 training and 10,000 test examples.
Hardware Specification	No	The paper mentions "V100 GPU" in the context of computational cost for a specific component (optimal FFT decoder) but does not provide specific hardware details for running its main experiments.
Software Dependencies	No	The paper mentions "tensorflow-datasets" and "NumPy" (in Appendix K), but does not specify version numbers for these or other software dependencies.
Experiment Setup	Yes	Models trained for 20 epochs on CIFAR10 with a batch size of 500. We sweep over learning rates of values (1 × 10^i, 2 × 10^i, 5 × 10^i) for i in {−2, −1}; We sweep over momentum values of 0, 0.85, 0.9, 0.95.
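
The CIFAR10 sweep described in the Experiment Setup row can be enumerated as a small grid. The sketch below is illustrative only and is not taken from the authors' repository. It assumes the exponents are i in {-2, -1} (the minus signs appear to have been lost in extraction), that every learning rate is paired with every momentum value, and that all 50,000 training examples are used each epoch. The dictionary keys are hypothetical names.

```python
from itertools import product

# Assumption: exponents are negative (i in {-2, -1}).
exponents = (-2, -1)
mantissas = (1, 2, 5)

# Build each value from a decimal literal to avoid float rounding noise.
learning_rates = sorted(float(f"{m}e{i}") for i in exponents for m in mantissas)
momenta = (0.0, 0.85, 0.9, 0.95)

# Fixed settings quoted in the table.
EPOCHS = 20
BATCH_SIZE = 500
TRAIN_EXAMPLES = 50_000  # CIFAR10 training split size

# Assumption: a full cross product of learning rate and momentum.
grid = [
    {"learning_rate": lr, "momentum": mu, "epochs": EPOCHS, "batch_size": BATCH_SIZE}
    for lr, mu in product(learning_rates, momenta)
]

# Assumption: every training example is used once per epoch.
steps_per_epoch = TRAIN_EXAMPLES // BATCH_SIZE
total_steps = steps_per_epoch * EPOCHS
```

Under these assumptions the grid has 6 learning rates and 4 momentum values, and each run takes 100 steps per epoch.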