Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Banded Square Root Matrix Factorization for Differentially Private Model Training
Authors: Nikita Kalinin, Christoph H. Lampert
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical experiments demonstrate that models trained using BSR perform on par with the best existing methods, while completely avoiding their computational overhead. |
| Researcher Affiliation | Academia | Nikita Kalinin Institute of Science and Technology (ISTA) Klosterneuburg, Austria EMAIL Christoph Lampert Institute of Science and Technology (ISTA) Klosterneuburg, Austria EMAIL |
| Pseudocode | Yes | Algorithm 1 Differentially Private SGD with Matrix Factorization |
| Open Source Code | Yes | To compute AOF, we solve the optimization problem (4) using the cvxpy package with SCS backend, see Algorithm B for the source code. |
| Open Datasets | Yes | To demonstrate the usefulness of BSR in practical settings, we follow the setup of Kairouz et al. [2021] and report results for training a simple ConvNet on the CIFAR-10 dataset (see Table 1 in Appendix C for the architecture). |
| Dataset Splits | Yes | In both cases, 20% of the training examples are used as validation sets to determine the learning rate η ∈ {0.01, 0.05, 0.1, 0.5, 1}, weight decay parameters α ∈ {0.99, 0.999, 0.9999, 1}, and momentum β ∈ {0, 0.9}. |
| Hardware Specification | Yes | Note that while the experiments for BSR and CVX used a single-core CPU-only environment, the experiments for GD and LBFGS were run on an NVIDIA H100 GPU with 16 available CPU cores. |
| Software Dependencies | No | The paper mentions software like 'python/numpy code', 'cvxpy package with SCS backend', 'jax', and 'optax toolbox', but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | To reflect the setting of single-participation training, we split the 50,000 training examples into batches of size m ∈ {1000, 500, 250, 200, 100, 50, 25}, resulting in n ∈ {100, 200, 400, 500, 1000, 2000} update steps. For repeated participation, we fix the batch size to 500 and run k ∈ {1, 2, ..., 10, 15, 20} epochs of training, i.e. n = 100k and b = 100. In both cases, 20% of the training examples are used as validation sets to determine the learning rate η ∈ {0.01, 0.05, 0.1, 0.5, 1}, weight decay parameters α ∈ {0.99, 0.999, 0.9999, 1}, and momentum β ∈ {0, 0.9}. |
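
The hyperparameter sets and the repeated-participation step count quoted in the Experiment Setup row can be written out as a small Python sketch. This is an illustration only, not the authors' code: the function and constant names are hypothetical, and treating the three sets as an exhaustive grid is an assumption, since the quoted text says only that the validation set is used to determine these values.

```python
from itertools import product

# Candidate values quoted in the Experiment Setup row.
LEARNING_RATES = [0.01, 0.05, 0.1, 0.5, 1]   # eta
WEIGHT_DECAYS = [0.99, 0.999, 0.9999, 1]     # alpha
MOMENTA = [0, 0.9]                           # beta

# Repeated-participation setting quoted in the same row.
NUM_TRAIN_EXAMPLES = 50_000
BATCH_SIZE = 500
EPOCHS = list(range(1, 11)) + [15, 20]       # k in {1, ..., 10, 15, 20}


def hyperparameter_grid():
    """Enumerate every (eta, alpha, beta) combination.

    Assumption: a full Cartesian grid; the source does not state
    how the candidate values were searched.
    """
    return [
        {"learning_rate": lr, "weight_decay": wd, "momentum": mom}
        for lr, wd, mom in product(LEARNING_RATES, WEIGHT_DECAYS, MOMENTA)
    ]


def repeated_participation_steps(k, num_examples=NUM_TRAIN_EXAMPLES,
                                 batch_size=BATCH_SIZE):
    """Return n = b * k, where b is the number of batches per epoch."""
    b = num_examples // batch_size  # b = 100 for the quoted setting
    return b * k


if __name__ == "__main__":
    print(len(hyperparameter_grid()), "hyperparameter combinations")
    for k in EPOCHS:
        print(f"k={k:2d} epochs -> n={repeated_participation_steps(k)} steps")
```

Under these assumptions the grid has 5 × 4 × 2 = 40 combinations, and k epochs at batch size 500 give n = 100k update steps, matching the quoted b = 100.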