Communication-Efficient Distributed PCA by Riemannian Optimization
Authors: Long-Kai Huang, Sinno Pan
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical performance is evaluated on three real-world datasets with different scales. These datasets are a9a, CIFAR-10 and rcv1. |
| Researcher Affiliation | Academia | School of Computer Science and Engineering, Nanyang Technological University, Singapore. |
| Pseudocode | Yes | Algorithm 1 CEDRE(w0, S, m, η) and Algorithm 2 RST-CEDRE(ŵ0, R, S, m, η) are provided. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a URL or explicit statement) for open-source code. |
| Open Datasets | Yes | We implement CEDRE with a manifold optimization toolbox manopt (Boumal et al., 2014) on a distributed computing platform MATLAB Parallel Server with multiple computers. The empirical performance is evaluated on three real-world datasets with different scales. These datasets are a9a, CIFAR-10 and rcv1. ... For a9a and CIFAR-10, we only use the training sets. For rcv1, we combine the original training sets and testing sets of rcv1.binary and rcv1.multiclass datasets to construct a large-scale dataset. |
| Dataset Splits | No | The paper mentions using "training sets" for some datasets and combining train/test sets for another, but does not specify explicit training/validation/test splits (e.g., percentages or counts) or a cross-validation setup for its experiments. It states: "The data instances are randomly and evenly partitioned over K = 100 local machines." This describes data distribution, not standard validation splits. |
| Hardware Specification | No | The paper states: "We implement CEDRE with a manifold optimization toolbox manopt (Boumal et al., 2014) on a distributed computing platform MATLAB Parallel Server with multiple computers." This does not include specific hardware details like CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions "a manifold optimization toolbox manopt (Boumal et al., 2014)" and "MATLAB Parallel Server". It does not provide specific version numbers for MATLAB, manopt, or any other relevant software libraries or dependencies. |
| Experiment Setup | Yes | As for the settings of CEDRE, we run the stochastic update, which means Step 10 in Algorithm 1 is replaced by (3) with batch size B = 1 and with option II. In addition, we set the local iteration length m = 5n and choose the step size η based on the best training loss with one communication round. |
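
The paper gives no code, so the following is only an illustrative sketch of the data distribution quoted in the table ("randomly and evenly partitioned over K = 100 local machines"). The function name `partition_evenly`, the seed, and the instance count of 1050 are hypothetical and not taken from the paper; the authors' implementation uses MATLAB with manopt, not Python.

```python
import numpy as np

def partition_evenly(num_instances, num_machines=100, seed=0):
    """Randomly split instance indices into near-equal shards, one per machine.

    Assumption: "evenly" is read as shard sizes differing by at most one
    when num_instances is not divisible by num_machines.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(num_instances)  # random order of all indices
    # array_split accepts sizes that do not divide evenly
    return np.array_split(shuffled, num_machines)

# Hypothetical example: 1050 instances spread over K = 100 machines.
shards = partition_evenly(num_instances=1050, num_machines=100)
shard_sizes = [len(s) for s in shards]
```

Each entry of `shards` holds the row indices that one local machine would own; every index appears in exactly one shard.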