Re-Analyze Gauss: Bounds for Private Matrix Approximation via Dyson Brownian Motion

Authors: Oren Mangoubi, Nisheeth Vishnoi

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our bounds are obtained by viewing the addition of Gaussian noise as a continuous-time matrix Brownian motion. This viewpoint allows us to track the evolution of eigenvalues and eigenvectors of the matrix, which are governed by stochastic differential equations discovered by Dyson. These equations allow us to bound the utility as the square-root of a sum-of-squares of perturbations to the eigenvectors, as opposed to a sum of perturbation bounds obtained via Davis-Kahan-type theorems. Empirically, we observe that Assumption 2.1 is satisfied for covariance matrices of many real-world datasets (see Appendix J), as well as on Wishart random matrices W = AᵀA, where A is an m × d matrix of i.i.d. Gaussian entries, for sufficiently large m (see Appendix I). (A hedged numerical sketch of this eigenvector-perturbation quantity appears after this table.)
Researcher Affiliation | Academia | Oren Mangoubi (Worcester Polytechnic Institute); Nisheeth K. Vishnoi (Yale University)
Pseudocode | No | The paper describes mathematical derivations and theoretical concepts, including stochastic differential equations and lemmas, but it does not present any pseudocode or algorithm blocks.
Open Source Code | Yes | Checklist question: "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?" [Yes] (See Appendix I)
Open Datasets | Yes | We observe empirically that Assumption 2.1 is satisfied on a number of real-world datasets which were previously used as benchmarks in the differentially private matrix approximation literature [11, 2] (see Appendix J). Assumption 2.1 is also satisfied, for instance, by random Wishart matrices W = AᵀA, where A is an m × d matrix of i.i.d. Gaussian entries, which are a popular model for sample covariance matrices [47]. (A sketch of this kind of eigenvalue-gap check appears after this table.)
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits with specific percentages or sample counts. While it mentions empirical verification and Appendix I, the main text does not detail any data partitioning strategy for model training.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run its experiments, such as CPU/GPU models, memory, or cloud instance types. While the checklist points to Appendix I, that appendix does not contain specific hardware specifications either.
Software Dependencies | No | The paper does not provide specific version numbers for any software components, libraries, or solvers used in the experiments. It does not mention Python, PyTorch, or any other programming environment with version details.
Experiment Setup | No | The paper does not contain specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size), optimizer settings, or other concrete training configurations. While it mentions Appendix I for empirical verification, it does not provide these details in the main text.
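
The eigenvector-perturbation quantity quoted in the Research Type row can be illustrated numerically. The sketch below is not the authors' code: it draws a Wishart-type matrix, adds one symmetric Gaussian perturbation (a single realization of the matrix Brownian motion at a fixed time), and reports the square root of the sum of squared per-eigenvector perturbations of the top-k eigenspace. The dimensions and the noise scale `sigma` are illustrative placeholders, not privacy-calibrated parameters.

```python
import numpy as np

# Minimal sketch (not the authors' implementation): perturb a covariance-style
# matrix with symmetric Gaussian noise and measure per-eigenvector perturbations
# of the top-k eigenspace. `sigma` and the dimensions are arbitrary placeholders.

rng = np.random.default_rng(0)
m, d, k, sigma = 500, 20, 5, 0.1

A = rng.standard_normal((m, d))
W = A.T @ A / m                        # Wishart-type sample covariance

G = rng.standard_normal((d, d))
E = sigma * (G + G.T) / np.sqrt(2)     # symmetric Gaussian perturbation
W_noisy = W + E

_, V = np.linalg.eigh(W)               # eigh returns eigenvalues in ascending order
_, V_hat = np.linalg.eigh(W_noisy)
Vk, Vk_hat = V[:, -k:], V_hat[:, -k:]  # top-k eigenvectors of each matrix

# Align signs column by column, then take the square root of the sum of squared
# per-eigenvector perturbations, roughly the quantity described in the quoted
# utility bound (up to sign alignment and constants).
signs = np.sign(np.sum(Vk * Vk_hat, axis=0))
per_vector_err = np.linalg.norm(Vk_hat * signs - Vk, axis=0)
print("sqrt(sum of squared eigenvector perturbations):",
      np.sqrt(np.sum(per_vector_err ** 2)))
```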
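
The Open Datasets row also mentions checking Assumption 2.1 empirically on Wishart matrices. The precise inequality in Assumption 2.1 must be read from the paper; the hypothetical snippet below only reports the raw eigenvalue gaps of one random draw of W = AᵀA with i.i.d. Gaussian entries.

```python
import numpy as np

# Hypothetical check in the spirit of the one described above: inspect the
# eigenvalue gaps of a Wishart matrix W = A^T A. The exact threshold required
# by Assumption 2.1 is stated in the paper; this only prints the raw gaps.

rng = np.random.default_rng(1)
m, d = 2000, 50                                   # "sufficiently large" m relative to d
A = rng.standard_normal((m, d))                   # i.i.d. Gaussian entries
W = A.T @ A

eigvals = np.sort(np.linalg.eigvalsh(W))[::-1]    # eigenvalues in descending order
gaps = eigvals[:-1] - eigvals[1:]
print("smallest eigenvalue gap:", gaps.min())
print("largest eigenvalue gap:", gaps.max())
```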