Re-Analyze Gauss: Bounds for Private Matrix Approximation via Dyson Brownian Motion
Authors: Oren Mangoubi, Nisheeth K. Vishnoi
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our bounds are obtained by viewing the addition of Gaussian noise as a continuous-time matrix Brownian motion. This viewpoint allows us to track the evolution of eigenvalues and eigenvectors of the matrix, which are governed by stochastic differential equations discovered by Dyson. These equations allow us to bound the utility as the square-root of a sum-of-squares of perturbations to the eigenvectors, as opposed to a sum of perturbation bounds obtained via Davis-Kahan-type theorems. Empirically, we observe that Assumption 2.1 is satisfied for covariance matrices of many real-world datasets (see Appendix J), as well as on Wishart random matrices W = AᵀA, where A is an m × d matrix of i.i.d. Gaussian entries, for sufficiently large m (see Appendix I). (A numerical sketch of this Brownian-motion viewpoint appears after the table.) |
| Researcher Affiliation | Academia | Oren Mangoubi (Worcester Polytechnic Institute); Nisheeth K. Vishnoi (Yale University) |
| Pseudocode | No | The paper describes mathematical derivations and theoretical concepts, including stochastic differential equations and lemmas, but it does not present any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] (See Appendix I) |
| Open Datasets | Yes | We observe empirically that Assumption 2.1 is satisfied on a number of real-world datasets which were previously used as benchmarks in the differentially private matrix approximation literature [11, 2] (see Appendix J). Assumption 2.1 is also satisfied, for instance, by random Wishart matrices W = AᵀA, where A is an m × d matrix of i.i.d. Gaussian entries, which are a popular model for sample covariance matrices [47]. (A minimal Wishart sketch appears after the table.) |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits with specific percentages or sample counts. While it mentions empirical verification and Appendix I, the main text does not detail the data partitioning strategy for model training. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run its experiments, such as CPU/GPU models, memory, or cloud instance types. While the checklist indicates Appendix I, that appendix does not contain specific hardware specifications either. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software components, libraries, or solvers used in the experiments. It does not mention Python, PyTorch, or any other programming environment with version details. |
| Experiment Setup | No | The paper does not contain specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size), optimizer settings, or other concrete training configurations. While it mentions Appendix I for empirical verification, it does not provide these details in the main text. |
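
The Dyson Brownian motion viewpoint quoted in the Research Type row lends itself to a quick numerical illustration. The sketch below is ours, not the authors' code: it discretizes M(t) = M₀ + B(t), where B(t) is a symmetric matrix Brownian motion under one common GOE normalization, and records the eigenvalue trajectories that, as the step size shrinks, follow Dyson's stochastic differential equations. All function names and the normalization are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def symmetric_brownian_increment(d, dt, rng):
    """One increment of symmetric matrix Brownian motion (GOE normalization):
    off-diagonal entries ~ N(0, dt), diagonal entries ~ N(0, 2*dt)."""
    G = rng.standard_normal((d, d))
    return np.sqrt(dt) * (G + G.T) / np.sqrt(2)

def eigenvalue_paths(M0, T=1.0, steps=200, rng=rng):
    """Eigenvalues of M(t) = M0 + B(t) along a discretized path; as dt -> 0
    these trajectories are governed by Dyson's eigenvalue SDEs."""
    dt = T / steps
    M = M0.astype(float).copy()
    paths = [np.linalg.eigvalsh(M)]
    for _ in range(steps):
        M += symmetric_brownian_increment(M.shape[0], dt, rng)
        paths.append(np.linalg.eigvalsh(M))
    return np.array(paths)  # shape: (steps + 1, d)

paths = eigenvalue_paths(np.diag([5.0, 3.0, 1.0]))
print("eigenvalues at t=0:", paths[0])
print("eigenvalues at t=T:", paths[-1])
```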
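
The Wishart matrices quoted in the Open Datasets row are likewise easy to generate for a sanity check. The sampler below matches the quoted construction W = AᵀA with A an m × d matrix of i.i.d. Gaussian entries; since the statement of Assumption 2.1 is not reproduced in this summary, the consecutive-eigenvalue-gap readout is only a hypothetical stand-in, not the paper's actual check.

```python
import numpy as np

rng = np.random.default_rng(1)

def wishart(m, d, rng):
    """Sample W = A^T A with A an m-by-d matrix of i.i.d. N(0, 1) entries."""
    A = rng.standard_normal((m, d))
    return A.T @ A

W = wishart(m=2000, d=20, rng=rng)
eigs = np.linalg.eigvalsh(W)  # eigenvalues in ascending order
gaps = np.diff(eigs)          # consecutive eigenvalue gaps
# Hypothetical proxy only: for large m the spectrum of W tends to be
# well separated, consistent with the paper's Appendix I observations.
print("smallest eigenvalue gap:", gaps.min())
```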