Decentralized Matrix Sensing: Statistical Guarantees and Fast Convergence
Authors: Marie Maros, Gesualdo Scutari
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | All simulations are performed on an Apple M2 Pro @ 3.5 GHz computer with 32 GB RAM, running macOS Ventura 13.3.1. We generate a random matrix $X \in \mathbb{R}^{d \times r_\star}$, with $d = 50$ and $r_\star = 2$, which we use throughout all experiments. The symmetric measurement matrices are generated as $A_i = \tfrac{1}{2}(S_i + S_i^\top)$, where the $S_i \in \mathbb{R}^{d \times d}$ have i.i.d. standard Gaussian elements. The communication networks are generated as Erdős–Rényi graphs with link activation probability $p = 0.05$ and different sizes $m$ (specified in each experiment below). For any generated graph, we set $\bar{W}$ according to Metropolis weights [28] and then let $W = \bar{W}^K$, with the integer $K$ chosen to meet condition (12) on $\rho$, resulting in $K$ communication rounds per agent/iteration. Finally, we choose a stepsize of $1/4$ and $\mu = d^{-3}$ (see the simulation sketch after the table). (i) Validating Theorem 1: This experiment shows that, under the conditions of Theorem 1, the test error behaves predictably like that of centralized GD (up to constant factors). Furthermore, the invariance of this error with respect to the network size $m$ is also confirmed, as long as $\rho = O(1/m^6)$ (as required in (12)). In the experiments, the total sample size is $N = 1000$, split equally among $m \in \{5, 100, 500, 1000\}$ agents. Fig. 1a plots the normalized test error; Fig. 1b shows $\|(V_M^\perp)^\top V_{U^t}\|$, which measures the misalignment of $U^t$ with the power-method matrix; and Fig. 1c displays $\sigma_{r_\star}(U^t)/\sigma_{r_\star}(X)$, which, combined with Fig. 1b, allows us to claim that the signal $U^t Q_t$ is well aligned with $V_{Z_\star}$ and full rank. The curves show that the behavior of the decentralized algorithm is close to that of centralized GD (blue lines). As predicted, the error decays quickly after the correct subspace has been identified. Furthermore, the convergence rate and generalization error are almost invariant to network-size scaling, as long as $\rho = O(m^{-6})$. (ii) Validating condition (12) on $\rho$: We showcase the necessity of decreasing $\rho$ while increasing $m$. Given a connected base graph with associated $\bar{W}$, for the sequence of graphs generated with increasing $m \in \{10, 50, 100, 500\}$, we let $W = \bar{W}^T$, with $T$ such that $\rho \approx 0.85$ for all $m$. This eventually violates (12). Fig. 2a (resp. Fig. 2b) plots the normalized generalization error versus the iterations for $m = 1$, $m = 10$, and $m = 50$ (resp. $m = 100$ and $m = 500$, where the two curves are plotted only up to iterations $t = 70$ and $t = 17$, respectively). The figures demonstrate the necessity of scaling down $\rho$ as $m$ increases. In fact, both the rate and the achievable estimation error degrade (and eventually break down) as the network size increases while keeping $\rho$ fixed. We claim that this stems from the fact that, if the network is not sufficiently well connected, the in-network RIP does not hold with sufficiently small tolerance, leading to a failure of the early power-method phase and consequently producing an unrecoverable misalignment with the signal subspace. |
| Researcher Affiliation | Academia | Marie Maros, School of Industrial Engineering, Purdue University, mmaros@purdue.edu; Gesualdo Scutari, School of Industrial Engineering, Purdue University, gscutari@purdue.edu |
| Pseudocode | No | The paper presents the algorithm (3) using mathematical notation but does not include a clearly labeled “Pseudocode” or “Algorithm” block with structured steps. |
| Open Source Code | No | The paper does not include any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper describes generating its own synthetic data for experiments (“generate a random matrix X”, “symmetric measurement matrices are generated as $A_i = \tfrac{1}{2}(S_i + S_i^\top)$, where $S_i \in \mathbb{R}^{d \times d}$ have i.i.d. standard Gaussian elements”) and does not provide access information for a public dataset. |
| Dataset Splits | No | The paper mentions splitting the total sample size equally among agents but does not specify explicit train/validation/test dataset splits with percentages or counts. |
| Hardware Specification | Yes | All simulations are performed on an Apple M2 Pro @ 3.5 GHz computer with 32 GB RAM, running macOS Ventura 13.3.1. |
| Software Dependencies | No | The paper mentions the operating system (“macOS Ventura 13.3.1”) but does not list specific software libraries or dependencies with version numbers (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | We generate a random matrix $X \in \mathbb{R}^{d \times r_\star}$, with $d = 50$ and $r_\star = 2$, which we use throughout all experiments. The symmetric measurement matrices are generated as $A_i = \tfrac{1}{2}(S_i + S_i^\top)$, where the $S_i \in \mathbb{R}^{d \times d}$ have i.i.d. standard Gaussian elements. The communication networks are generated as Erdős–Rényi graphs with link activation probability $p = 0.05$ and different sizes $m$ (specified in each experiment below). For any generated graph, we set $\bar{W}$ according to Metropolis weights [28] and then let $W = \bar{W}^K$, with the integer $K$ chosen to meet condition (12) on $\rho$, resulting in $K$ communication rounds per agent/iteration. Finally, we choose a stepsize of $1/4$ and $\mu = d^{-3}$. |
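
The setup quoted above is concrete enough to sketch in code. Below is a minimal NumPy sketch of the data and network generation, not the authors' implementation: the seed, the connectivity check, the `rho_target` value, and the helper names (`W_bar`, `rho_of`) are illustrative assumptions, and since condition (12) itself is not quoted in the table, the target mixing parameter is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem dimensions from the paper's setup: d = 50, r_star = 2.
d, r_star = 50, 2
m = 5            # number of agents (varied across experiments)
N = 1000         # total sample size, split equally among agents
n = N // m       # measurements per agent

# Ground-truth low-rank factor X in R^{d x r_star}.
X = rng.standard_normal((d, r_star))

# Symmetric Gaussian measurement matrices A_i = (S_i + S_i^T) / 2.
A = rng.standard_normal((N, d, d))
A = 0.5 * (A + A.transpose(0, 2, 1))

# Linear measurements y_k = <A_k, X X^T>.
M_true = X @ X.T
y = np.einsum('kij,ij->k', A, M_true)

# Erdos-Renyi graph with activation probability p = 0.05, resampled until
# connected (Laplacian rank m-1); for small m and p this may take many tries.
p = 0.05
while True:
    adj = np.triu(rng.random((m, m)) < p, k=1)
    adj = adj | adj.T
    deg = adj.sum(axis=1)
    lap = np.diag(deg) - adj.astype(int)
    if np.linalg.matrix_rank(lap) == m - 1:
        break

# Metropolis weights: w_ij = 1 / (1 + max(deg_i, deg_j)) on edges,
# diagonal chosen so each row sums to one.
W_bar = np.zeros((m, m))
for i in range(m):
    for j in range(m):
        if adj[i, j]:
            W_bar[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
    W_bar[i, i] = 1.0 - W_bar[i].sum()

# Multiple gossip rounds per iteration: W = W_bar^K, with K chosen so that
# the mixing parameter rho = ||W - (1/m) 1 1^T||_2 is small enough.
rho_target = 1e-2   # placeholder; the paper ties this threshold to condition (12)
rho_of = lambda Wm: np.linalg.svd(Wm - np.ones((m, m)) / m, compute_uv=False)[0]
K, W = 1, W_bar.copy()
while rho_of(W) > rho_target:
    W = W @ W_bar
    K += 1
```

Taking higher powers of $\bar{W}$ (larger $K$) is what drives $\rho$ down; since the table states that condition (12) requires $\rho = O(1/m^6)$, more gossip rounds per iteration are needed as $m$ grows.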
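
Continuing from the sketch above, the loop below illustrates a generic decentralized gradient iteration (consensus with $W$, then a local gradient step on the quadratic sensing loss) together with the normalized test error plotted in Fig. 1a. The paper's exact update (3) is not quoted in the table, so this generic DGD form, the loss scaling inside `grad_fi`, the iteration count, and the use of the agent average for the error are all assumptions.

```python
def grad_fi(U, A_i, y_i):
    """Gradient of f_i(U) = (1/(4n)) * sum_k (<A_k, U U^T> - y_k)^2."""
    resid = np.einsum('kij,ij->k', A_i, U @ U.T) - y_i   # per-measurement residuals
    return np.einsum('k,kij->ij', resid, A_i) @ U / len(y_i)

# Small, identical initialization at every agent (scale mu = d^{-3}).
mu = d ** -3
U0 = mu * rng.standard_normal((d, r_star))
U = np.stack([U0] * m)              # U[i] is agent i's local iterate

eta = 0.25                          # stepsize 1/4 from the experiments
A_loc = A.reshape(m, n, d, d)       # agent i holds A_loc[i], y_loc[i]
y_loc = y.reshape(m, n)

errors = []
for t in range(300):
    # One consensus step with W (= W_bar^K), then a local gradient step.
    mixed = np.einsum('ij,jkl->ikl', W, U)
    grads = np.stack([grad_fi(U[i], A_loc[i], y_loc[i]) for i in range(m)])
    U = mixed - eta * grads
    # Normalized test error of the network-average iterate.
    U_avg = U.mean(axis=0)
    errors.append(np.linalg.norm(U_avg @ U_avg.T - M_true)
                  / np.linalg.norm(M_true))

print(f"final normalized test error: {errors[-1]:.3e}")
```

The quantities tracked in Figs. 1b and 1c ($\|(V_M^\perp)^\top V_{U^t}\|$ and $\sigma_{r_\star}(U^t)/\sigma_{r_\star}(X)$) can be computed from `U_avg` inside the same loop.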