Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
Authors: Christopher A. Choquette-Choo, Hugh Brendan McMahan, J. Keith Rush, Abhradeep Guha Thakurta
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive empirical evaluation on both example-level DP for image classification and user-level DP for language modeling demonstrate substantial improvements over all previous methods, including the widely-used DP-SGD. |
| Researcher Affiliation | Industry | Google Research. Correspondence to: <{cchoquette,krush,mcmahan,athakurta}@google.com>. |
| Pseudocode | Yes | Algorithm 1 DP-Prefix Sum Computation via FFT (with d = 1). (An illustrative matrix-factorization sketch of DP prefix sums appears after the table.) |
| Open Source Code | Yes | Our code is at: https://github.com/google-research/federated/tree/master/multi_epoch_dp_matrix_factorization |
| Open Datasets | Yes | We train image classification models on CIFAR10 (Krizhevsky, 2009)... and We use the standard benchmark: Stack Overflow next-word prediction (Reddi et al., 2020). |
| Dataset Splits | Yes | We train image-classification models using the CIFAR10 dataset as hosted in tensorflow-datasets, containing 50,000 training and 10,000 test examples. (A loading snippet appears after the table.) |
| Hardware Specification | No | The paper mentions "V100 GPU" in the context of computational cost for a specific component (optimal FFT decoder) but does not provide specific hardware details for running its main experiments. |
| Software Dependencies | No | The paper mentions "tensorflow-datasets" and "NumPy" (in Appendix K), but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Models trained for 20 epochs on CIFAR10 with a batch size of 500. We sweep over learning rates of values (1·10^i, 2·10^i, 5·10^i) for i in {-2, -1}; we sweep over momentum values of 0, 0.85, 0.9, 0.95. (A sketch of this sweep grid appears after the table.) |
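
The Pseudocode row excerpts the paper's Algorithm 1, a DP prefix-sum computation. The sketch below illustrates the general matrix-factorization idea behind such mechanisms, not the paper's FFT-based algorithm or its multi-epoch-optimized factorization: it uses the simple square-root factorization B = C = A^{1/2} of the prefix-sum matrix and calibrates Gaussian noise to the column norms of C. The function name and interface are hypothetical.

```python
import numpy as np
from scipy.linalg import sqrtm


def dp_prefix_sums(x, noise_multiplier, rng=None):
    """Illustrative sketch (not the paper's Algorithm 1): DP prefix sums of a
    scalar stream x via a matrix-factorization mechanism A = B @ C.

    We add Gaussian noise in the factored space and release B @ (C @ x + z),
    an unbiased, differentially private estimate of the prefix sums A @ x.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    # A is the lower-triangular all-ones matrix, so A @ x gives prefix sums.
    A = np.tril(np.ones((n, n)))
    # Simple fixed factorization B = C = A^{1/2}; the paper instead optimizes
    # B and C under a multi-epoch participation constraint.
    C = np.real(sqrtm(A))
    B = C
    # L2 sensitivity of C @ x under a single participation with |x_t| <= 1
    # is the maximum column norm of C.
    sensitivity = np.max(np.linalg.norm(C, axis=0))
    z = rng.normal(scale=noise_multiplier * sensitivity, size=n)
    return B @ (C @ x + z)
```

The distinguishing contribution of the paper is that B and C are optimized for multi-epoch participation patterns rather than fixed as above.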
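
As a quick check of the Dataset Splits row, CIFAR10 as hosted in tensorflow-datasets can be loaded with the standard 50,000/10,000 split as follows; this is a generic tfds snippet, not code from the paper's repository.

```python
import tensorflow_datasets as tfds

# Standard CIFAR10 splits: 50,000 training and 10,000 test examples.
(train_ds, test_ds), info = tfds.load(
    "cifar10", split=["train", "test"], as_supervised=True, with_info=True)
print(info.splits["train"].num_examples, info.splits["test"].num_examples)
```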
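
The Experiment Setup row describes a grid over learning rates and momentum values. The sketch below only enumerates that grid; the training call is left as a placeholder, since the actual entry points live in the linked repository.

```python
from itertools import product

# Learning rates (1·10^i, 2·10^i, 5·10^i) for i in {-2, -1}: 0.01 through 0.5.
learning_rates = [m * 10.0 ** i for i in (-2, -1) for m in (1, 2, 5)]
momenta = [0.0, 0.85, 0.9, 0.95]

for lr, momentum in product(learning_rates, momenta):
    # Placeholder for one training run: 20 epochs on CIFAR10, batch size 500,
    # as described in the Experiment Setup row.
    print(f"epochs=20 batch_size=500 lr={lr:g} momentum={momentum}")
```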