Communication Efficient Parallel Algorithms for Optimization on Manifolds

Authors: Bayan Saparbayeva, Michael Zhang, Lizhen Lin

NeurIPS 2018

Reproducibility Assessment (Variable / Result / LLM Response)
Research Type: Experimental
To examine the quality of our parallel algorithm, we first apply it to the estimation of Fréchet means on spheres, where the extrinsic mean (the true empirical minimizer) has a closed-form expression. In addition, we apply our algorithm to the Netflix movie-ranking data set as an example of optimization over Grassmannian manifolds in the low-rank matrix completion problem. In the following results, we demonstrate the utility of our algorithm both for high-dimensional manifold-valued data (Section 5.1) and for Euclidean-space data with non-Euclidean parameters (Section 5.2).
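The closed-form extrinsic mean mentioned above is simply the Euclidean average of the sample projected back onto the sphere. A minimal NumPy sketch of that baseline (illustrative only; the function name is ours, not from the paper's repository):

```python
import numpy as np

def extrinsic_mean_sphere(X):
    """Extrinsic Frechet mean of points on the unit sphere.

    X: (n, d) array of unit vectors. Under the chordal (embedding)
    distance, the minimizer is the Euclidean mean projected back onto
    the sphere, provided that mean is nonzero.
    """
    m = X.mean(axis=0)
    norm = np.linalg.norm(m)
    if norm == 0:
        raise ValueError("Euclidean mean is zero; extrinsic mean undefined")
    return m / norm

# Usage: points clustered around the north pole in R^3
rng = np.random.default_rng(0)
X = rng.normal(loc=[0.0, 0.0, 5.0], scale=1.0, size=(1000, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)
mu = extrinsic_mean_sphere(X)
```

Because this estimator is exact, it serves as the ground truth against which the parallel algorithm's output can be compared.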
Researcher Affiliation: Academia
Bayan Saparbayeva, Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, Indiana 46556, USA (bsaparba@nd.edu); Michael Minyi Zhang, Department of Computer Science, Princeton University, Princeton, New Jersey 08540, USA (mz8@cs.princeton.edu); Lizhen Lin, Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, Indiana 46556, USA (lizhen.lin@nd.edu)
Pseudocode: Yes
Algorithm 1: ILEA for Manifolds
Open Source Code: Yes
Our code is available at https://github.com/michaelzhang01/parallel_manifold_opt
Open Datasets: No
We apply our algorithm to the Netflix movie-ranking data set as an example of optimization over Grassmannian manifolds in the low-rank matrix completion problem. In our example we set the matrix rank to r = 10 and the regularization parameter to λ = 0.1, and divided the data randomly across 4 processors. For the sphere example, we simulate one million observations from a 100-dimensional von Mises distribution projected onto the unit sphere, with mean sampled randomly from N(0, I) and a precision of 2. The paper names the data sets used but does not provide any specific link, DOI, repository name, or formal citation (with author and year) for accessing them.
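One plausible reading of the simulated-data setup is a projected-Gaussian sampler: draw a mean direction from N(0, I), perturb it with Gaussian noise at the stated precision, and normalize onto the unit sphere. Whether the paper samples an exact von Mises-Fisher law or this projected normal is not specified in the report, so treat the following as an assumption:

```python
import numpy as np

def simulate_projected(n, d, precision, seed=0):
    """Sketch of the simulated-data setup described above.

    ASSUMPTION: this is a projected-normal stand-in for the paper's
    'von Mises distribution projected onto the unit sphere'; the exact
    sampler used by the authors is not specified here.
    """
    rng = np.random.default_rng(seed)
    mu = rng.standard_normal(d)
    mu /= np.linalg.norm(mu)          # mean direction drawn from N(0, I)
    noise = rng.standard_normal((n, d)) / np.sqrt(precision)
    X = mu + noise                    # Gaussian around mu with given precision
    return X / np.linalg.norm(X, axis=1, keepdims=True), mu

# Small-scale usage; the paper uses n = 1_000_000, d = 100, precision = 2.
X, mu = simulate_projected(n=10_000, d=100, precision=2)
```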
Dataset Splits: No
The paper mentions applying the algorithm to the Netflix movie-ranking data set and simulated data, and states 'divided the data randomly across 4 processors' and 'we run 20 trials of our algorithm over 1, 2, 4, 6, 8 and 10 processors.' However, it does not provide specific training/validation/test splits by percentage or sample count, nor does it reference predefined splits with citations.
Hardware Specification: No
The paper does not provide specific details about the hardware (e.g., CPU/GPU models, memory, or cloud resources) used to run its experiments.
Software Dependencies: No
We wrote the code for our implementations in Python and carried out the parallelization of the code through MPI [7]. The paper mentions Python and MPI but does not provide version numbers for these software components.
Experiment Setup: Yes
In our example we set the matrix rank to r = 10 and the regularization parameter to λ = 0.1, and divided the data randomly across 4 processors. We select the step-size parameter according to the modified Armijo algorithm seen in [6].
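The Armijo rule cited above is a backtracking line search: shrink the trial step until the retracted point yields sufficient decrease. A generic sketch for Riemannian gradient descent on the sphere (illustrative only; this is plain Armijo backtracking, not the specific modification of [6]):

```python
import numpy as np

def armijo_backtracking(f, x, grad, direction, alpha0=1.0, beta=0.5, c=1e-4):
    """Plain Armijo backtracking on the sphere: halve the step until
    f(retract(x, alpha * direction)) satisfies sufficient decrease."""
    retract = lambda p, v: (p + v) / np.linalg.norm(p + v)
    alpha = alpha0
    slope = grad @ direction  # directional derivative; negative for descent
    while f(retract(x, alpha * direction)) > f(x) + c * alpha * slope:
        alpha *= beta
        if alpha < 1e-12:
            break
    return alpha

def sphere_gradient_descent(Y, iters=100):
    """Minimize the chordal Frechet objective f(x) = mean ||x - y_i||^2
    over the unit sphere, with Armijo-chosen step sizes."""
    f = lambda x: np.mean(np.sum((Y - x) ** 2, axis=1))
    x = Y[0] / np.linalg.norm(Y[0])
    for _ in range(iters):
        egrad = 2 * (x - Y.mean(axis=0))      # Euclidean gradient
        rgrad = egrad - (egrad @ x) * x       # project onto tangent space
        if np.linalg.norm(rgrad) < 1e-10:
            break
        d = -rgrad
        alpha = armijo_backtracking(f, x, rgrad, d)
        x = (x + alpha * d) / np.linalg.norm(x + alpha * d)
    return x
```

For this objective the minimizer is the extrinsic mean (the normalized Euclidean average), which gives an easy correctness check for the line search.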