Communication-efficient distributed eigenspace estimation with arbitrary node failures

Authors: Vasileios Charisopoulos, Anil Damle

NeurIPS 2022

Reproducibility Assessment
Variable: Research Type
Result: Experimental
LLM Response: "We provide a brief numerical illustration of the performance of Algorithm 1 on data sampled from an unknown Gaussian distribution... Our experiment is illustrated in Figure 1. Clearly, the baseline methods break down in the presence of corruption, yielding solutions nearly orthogonal to V as α approaches 1/2. In contrast, the error of Alg. 1 degrades gracefully with α. We note that our algorithm yields a nontrivial solution even when almost half of the measurements are corrupted (α = 45%), in line with intuition suggesting that α = 1/2 is a natural breakdown point for outlier-robust algorithms."
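The breakdown-point experiment quoted above can be sketched as follows. This is a hypothetical reconstruction, not the authors' code: it substitutes an entrywise median of local projection matrices for the paper's Algorithm 1 as the robust aggregator, and the dimensions, sample size, and corruption model are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, m, alpha = 32, 2, 20, 0.45   # illustrative dimensions and corruption level

# Ground-truth covariance with a unit eigengap after the top-r eigenvalues.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
V = Q[:, :r]                        # true top-r eigenspace
Sigma = Q @ np.diag(np.r_[np.full(r, 2.0), np.full(n - r, 1.0)]) @ Q.T

def local_projector(num_samples=4000):
    """One node's estimate: projector onto the top-r eigenvectors of its sample covariance."""
    X = rng.multivariate_normal(np.zeros(n), Sigma, size=num_samples)
    _, U = np.linalg.eigh(X.T @ X / num_samples)   # eigenvalues ascending
    Y = U[:, -r:]
    return Y @ Y.T

P = [local_projector() for _ in range(m)]

# An adversary replaces a fraction alpha of the nodes with an arbitrary
# (here, large) matrix aimed at a subspace W orthogonal to V.
W = Q[:, r:2 * r]
for i in range(int(alpha * m)):
    P[i] = 100.0 * (W @ W.T)

def top_r_space(M):
    _, U = np.linalg.eigh((M + M.T) / 2)
    return U[:, -r:]

def err(U):
    """Spectral sin-theta distance between span(U) and the true span(V)."""
    return np.linalg.norm(U - V @ (V.T @ U), 2)

U_mean = top_r_space(np.mean(P, axis=0))    # naive averaging of node estimates
U_med = top_r_space(np.median(P, axis=0))   # entrywise-median stand-in (not Alg. 1)

print(err(U_mean), err(U_med))  # the naive average breaks down; the median does not
```

Even at α = 45% the median-based aggregate stays close to V, while the naive average is pulled almost entirely into the adversarial subspace, mirroring the qualitative behavior described in the quote.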
Variable: Researcher Affiliation
Result: Academia
LLM Response: "Vasileios Charisopoulos, Operations Research & Information Engineering, Cornell University, Ithaca, NY 14853, vc333@cornell.edu; Anil Damle, Computer Science, Cornell University, Ithaca, NY 14853, damle@cornell.edu"
Variable: Pseudocode
Result: Yes
LLM Response: "Algorithm 1: Robust distributed eigenspace estimation... Algorithm 2: Robust Reference Estimator(Y_1, ..., Y_m)... Algorithm 3: Procrustes Fixing({Y_1, ..., Y_m}, Y_ref)... Algorithm 4: Filter(S := {X_i}_{i=1,...,m}, λ_ub)... Algorithm 5: Adaptive Filter(S = {X_i}_{i=1,...,m}, λ_ub, λ_lb, p, α)"
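The "Procrustes Fixing" step (Algorithm 3) aligns each node's eigenspace estimate to a reference basis before aggregation, since local bases are only determined up to an orthogonal transform. A minimal sketch of one plausible such step, assuming the standard orthogonal Procrustes solution via SVD (the function name and signature are hypothetical, not the paper's):

```python
import numpy as np

def procrustes_fix(Ys, Y_ref):
    """Align each n-by-r basis Y_i to Y_ref by an orthogonal r-by-r transform.

    Solves min_{Q orthogonal} ||Y_i Q - Y_ref||_F; the optimum is Q = U W^T,
    where U S W^T is an SVD of Y_i^T Y_ref.
    """
    fixed = []
    for Y in Ys:
        U, _, Wt = np.linalg.svd(Y.T @ Y_ref)
        fixed.append(Y @ (U @ Wt))
    return fixed

# Demo: a basis spanning the same subspace as Y_ref, but rotated.
rng = np.random.default_rng(1)
Y_ref, _ = np.linalg.qr(rng.standard_normal((10, 2)))
R, _ = np.linalg.qr(rng.standard_normal((2, 2)))   # random orthogonal transform
Y = Y_ref @ R

misaligned = np.linalg.norm(Y - Y_ref)
aligned = np.linalg.norm(procrustes_fix([Y], Y_ref)[0] - Y_ref)
print(misaligned, aligned)   # alignment error drops to roundoff after fixing
```

After alignment, the local bases can be averaged meaningfully; without it, sign and rotation ambiguities would cancel out even uncorrupted estimates.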
Variable: Open Source Code
Result: Yes
LLM Response: "3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]"
Variable: Open Datasets
Result: No
LLM Response: The paper uses data sampled from an unknown Gaussian distribution, indicating synthetic data generation rather than a pre-existing publicly available dataset. No link, DOI, or citation for a public dataset is provided.
Variable: Dataset Splits
Result: No
LLM Response: The paper describes a generative model for the data and performs numerical studies, but it does not specify training, validation, or test splits (as percentages or counts) for reproducing experiments on a fixed dataset.
Variable: Hardware Specification
Result: No
LLM Response: "3. If you ran experiments... (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]"
Variable: Software Dependencies
Result: No
LLM Response: The paper does not name any libraries, frameworks, or solvers with the version numbers needed to replicate the experiments.
Variable: Experiment Setup
Result: Yes
LLM Response: "3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes]. ... We fix the gap δ = 0.25 throughout. ... We fix r = 5, r̄ = 2r, κ = 5, and δ = 0.25 and experiment with {(m, n) = (32, 2^i) | i = 5, ..., 10}, {(m, n) = (2^i, 128) | i = 4, ..., 9}, and α ∈ {0.25, 0.45}."
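Reading the exponents in the quoted setup as powers of two, the experiment grid it describes can be enumerated as follows (a sketch; the variable names are hypothetical, and r̄ is written r_bar):

```python
# Fixed hyperparameters from the quoted setup.
r, kappa, delta = 5, 5, 0.25
r_bar = 2 * r

# Two sweeps: vary n with m = 32 fixed, then vary m with n = 128 fixed.
grid_n = [(32, 2**i) for i in range(5, 11)]    # (m, n) = (32, 2^i), i = 5..10
grid_m = [(2**i, 128) for i in range(4, 10)]   # (m, n) = (2^i, 128), i = 4..9
alphas = [0.25, 0.45]

configs = [(m, n, a) for (m, n) in grid_n + grid_m for a in alphas]
print(len(configs))  # 24 configurations: (6 + 6) grid points x 2 corruption levels
```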