Federated Principal Component Analysis

Authors: Andreas Grammenos, Rodrigo Mendoza Smith, Jon Crowcroft, Cecilia Mascolo

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Numerical simulations show that, while using limited-memory, our algorithm exhibits performance that closely matches or outperforms traditional non-federated algorithms, and in the absence of communication latency, it exhibits attractive horizontal scalability." All experiments ran on a workstation with an AMD 1950X CPU (16 cores at 4.0 GHz), 128 GB 3200 MHz DDR4 RAM, and Matlab R2020a (build 9.8.0.1380330). Code and datasets are publicly available at https://www.github.com/andylamp/federated_pca. To quantify the utility loss introduced by differential privacy, the quality of the projections is compared on the MNIST standard test set [30] and the Wine [10] dataset.
Researcher Affiliation | Collaboration | Andreas Grammenos (1,3), Rodrigo Mendoza-Smith (2), Jon Crowcroft (1,3), Cecilia Mascolo (1). 1: Computer Lab, University of Cambridge; 2: Quine Technologies; 3: Alan Turing Institute.
Pseudocode | Yes | "Our procedure is presented in Alg. 1. ... Merge and FPCA-Edge are described in Algs. 2 and 3." Algorithm 1: Federated PCA (FPCA); Algorithm 2: Merger [46, 17]; Algorithm 3: Federated PCA Edge (FPCA-Edge).
Open Source Code | Yes | "To foster reproducibility both code and datasets used for our numerical evaluation are made publicly available at: https://www.github.com/andylamp/federated_pca."
Open Datasets | Yes | To quantify the utility loss introduced by differential privacy, the quality of the projections is compared using "the MNIST standard test set [30] and Wine [10] datasets which contain, respectively, 10000 labelled images of handwritten digits and physicochemical data for 6498 variants of red and white wine."
Dataset Splits | No | The paper evaluates on the MNIST standard test set and the Wine dataset, but it does not specify explicit training/validation/test splits, percentages, or a partitioning methodology beyond the implied use of standard test sets; FPCA is applied to "the same datasets" without detailing how the data was split for training or validation of the FPCA model itself.
Hardware Specification | Yes | "All our experiments were computed on a workstation using an AMD 1950X CPU with 16 cores at 4.0GHz, 128 GB 3200 MHz DDR4 RAM"
Software Dependencies | Yes | "Matlab R2020a (build 9.8.0.1380330)"
Experiment Setup | Yes | "Then, on the same datasets, we applied FPCA with rank estimate r = 6, block size b = 25, and DP budget (ε, δ) = (0.1, 0.1)." To evaluate the utility loss with respect to the privacy-accuracy trade-off, δ = 0.01 is fixed and q_A = ⟨v1, v̂1⟩ is plotted for ε ∈ {0.1k : k ∈ {1, ..., 40}}, where v1 and v̂1 are defined as in Lemma 2. Synthetic data is generated from a power-law spectrum, Y_α ∼ Synth(α) ∈ R^(d×n), using α ∈ {0.01, 0.1, 0.5, 1}.
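The Merger of Algorithm 2 is attributed to [46, 17] and not reproduced in this report. As a rough illustration of what such a step computes, here is a minimal Python sketch of the standard subspace-merge construction from the incremental-SVD literature: take the SVD of the concatenated, singular-value-weighted local bases and truncate back to rank r. This is an assumption about the general form; the paper's exact weighting and forgetting factors may differ.

```python
import numpy as np

def merge(U1, S1, U2, S2, r):
    """Merge two rank-r subspace estimates (U_i, S_i) into one.

    Sketch of a generic subspace merge: SVD the concatenation of the
    singular-value-weighted bases, then truncate to rank r. Not the
    paper's exact Merger (Alg. 2), whose weighting may differ.
    """
    C = np.hstack([U1 * S1, U2 * S2])              # d x 2r weighted bases
    U, S, _ = np.linalg.svd(C, full_matrices=False)
    return U[:, :r], S[:r]

# Toy usage: two local PCA sketches of halves of the same d=20 dataset.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 200))
U1, S1, _ = np.linalg.svd(X[:, :100], full_matrices=False)
U2, S2, _ = np.linalg.svd(X[:, 100:], full_matrices=False)
r = 3
U, S = merge(U1[:, :r], S1[:r], U2[:, :r], S2[:r], r)
print(U.shape)   # merged rank-3 basis of the ambient 20-dim space
```

In a federated tree of clients, pairwise application of such a merge aggregates the edge-computed subspaces without ever sharing raw data.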
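Neither the synthetic generator Synth(α) nor the computation of q_A is fully specified in this excerpt. A hypothetical Python sketch, assuming Synth(α) draws a matrix with power-law singular spectrum σ_i = i^(−α) and random orthonormal factors, and measuring the utility q = |⟨v1, v̂1⟩| between the true leading principal direction and an estimate (here, from a noise-perturbed copy standing in for a DP output):

```python
import numpy as np

def synth(alpha, d, n, rng):
    """Hypothetical Synth(alpha): data with singular spectrum i**(-alpha)
    and random orthonormal left/right factors. The paper's generator is
    not specified in this excerpt; this is one plausible construction."""
    U, _ = np.linalg.qr(rng.standard_normal((d, d)))
    V, _ = np.linalg.qr(rng.standard_normal((n, d)))
    sigma = np.arange(1, d + 1, dtype=float) ** (-alpha)
    return U @ np.diag(sigma) @ V.T                # d x n

rng = np.random.default_rng(1)
Y = synth(0.5, 50, 200, rng)                       # alpha = 0.5 example

# Utility metric q = |<v1, v1_hat>|: overlap of the true leading
# principal direction with an estimate from a perturbed copy of Y.
v1 = np.linalg.svd(Y, full_matrices=False)[0][:, 0]
v1_hat = np.linalg.svd(Y + 1e-3 * rng.standard_normal(Y.shape),
                       full_matrices=False)[0][:, 0]
q = abs(v1 @ v1_hat)                               # near 1 for small noise
print(round(q, 3))
```

Sweeping the perturbation scale (in the paper, the DP noise driven by ε) and plotting q against ε reproduces the shape of the privacy-accuracy trade-off curve described above.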