Decentralized Riemannian Gradient Descent on the Stiefel Manifold

Authors: Shixiang Chen, Alfredo Garcia, Mingyi Hong, Shahin Shahrampour

ICML 2021

Reproducibility Variable Result LLM Response
Research Type Experimental We report the convergence results of DRSGD, DRDGD and DRGTA with different t and ˆβ on synthetic data. We fix m1 = . . . = mn = 1000, d = 100 and r = 5 and generate m1 n i.i.d samples following standard multivariate Gaussian distribution to obtain A. Let A = USV be the truncated SVD. Given an eigengap (0, 1), we modify the singular values of A to be a geometric sequence, i.e. Si,i = S0,0 i/2, i [d]. Typically, larger results in more difficult problem. In Figure 1, we show the results of DRSGD, DRDGD and DRGTA on the data with n = 32 and = 0.8. The y-axis is the log-scale distance. The first four lines in each testing case are for the ring graph, and the last one is on a complete graph with equally weighted matrix, which aims to show the case of t . In Figure 1(a), when fixing ˆβ, it is shown that that smaller ˆβ produces higher accuracy, which verifies Theorem 4.2. We see DRSGD performs almost the same with different t {1, 10, }. For the two deterministic algorithms DRDGD and DRGTA, we see that DRDGD can use larger ˆβ if more communication rounds t is used in Figure 1(b),(c). DRDGD cannot achieve exact convergence with constant stepsize, while DRGTA successfully solves the problem using t {10, }, ˆβ = 0.05. We provide some numerical results on the MNIST dataset (Le Cun).
Researcher Affiliation Academia 1 The Wm Michael Barnes '64 Department of Industrial and Systems Engineering, Texas A&M University, College Station, TX 77843, USA. 2 The Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, USA.
Pseudocode Yes Algorithm 1: Decentralized Riemannian Stochastic Gradient Descent (DRSGD) for Solving (1.1); Algorithm 2: Decentralized Riemannian Gradient Tracking over the Stiefel manifold (DRGTA) for Solving (1.1). (An illustrative single-iteration sketch appears after this table.)
Open Source Code Yes For reproducibility of results, our code is made available at https://github.com/chenshixiang/Decentralized_Riemannian_gradient_descent_on_Stiefel_manifold.
Open Datasets Yes We provide some numerical results on the MNIST dataset (Le Cun).
Dataset Splits No The paper refers to using datasets for 'epochs' and 'iterations' but does not explicitly describe train/validation/test splits, nor does it refer to specific predefined splits or cross-validation methods.
Hardware Specification Yes The experiments are evaluated on an HPC cluster, where each computation node is an Intel Xeon 6248R CPU. The computation nodes are connected by Mellanox HDR 100 InfiniBand.
Software Dependencies No The codes are implemented in Python with mpi4py (Dalcín et al., 2005). While Python and mpi4py are mentioned, specific version numbers for these software components are not provided. (A minimal mpi4py communication sketch appears after this table.)
Experiment Setup Yes For DRSGD, we set the maximum epoch to 200 and early stop it if d_s(x̄_k, x*) ≤ 10^-5. For DRGTA and DRDGD, we set the maximum iteration number to 10^4 and the termination condition is d_s(x̄_k, x*) ≤ 10^-8 or ‖grad f(x̄_k)‖_F ≤ 10^-8. We set β_k = β̂ / ((1/n) Σ_{i=1}^n m_i) for DRGTA and DRDGD, where β̂ will be specified later. For DRSGD, we set β = β̂ / 200. We fix α = 1 and generate the initial points uniformly at random satisfying x_{1,0} = ... = x_{n,0} ∈ M. We set the maximum epoch as 300 in all experiments. The stepsize is set to β = (n/10000)/300 · β̂, where β̂ is tuned for the best performance. (A sketch of the stopping metrics appears after this table.)
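
To make the synthetic-data recipe quoted under Research Type concrete, here is a minimal NumPy sketch under stated assumptions: the variable names, the eigengap symbol `delta`, and the even row split across agents are ours, not taken from the authors' repository.

```python
import numpy as np

# Minimal sketch of the synthetic-data construction described above.
# Assumptions: m_i = 1000 samples per node, d = 100, r = 5, n = 32 agents,
# eigengap parameter delta = 0.8; all names are illustrative.
n, m_i, d, r, delta = 32, 1000, 100, 5, 0.8

rng = np.random.default_rng(0)
A = rng.standard_normal((n * m_i, d))          # i.i.d. standard Gaussian samples

# Truncated SVD A = U S V^T, then reshape the spectrum into a geometric sequence
U, S, Vt = np.linalg.svd(A, full_matrices=False)
S_new = S[0] * delta ** (np.arange(d) / 2.0)   # S_{i,i} = S_{0,0} * delta^{i/2}
A = U @ np.diag(S_new) @ Vt                    # data matrix with controlled eigengap

# Split the rows evenly across the n agents of the decentralized network
A_local = np.split(A, n)                       # A_local[i] has shape (m_i, d)
```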
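The two algorithms listed under Pseudocode are stated formally in the paper; as a reading aid, here is a hedged sketch of one DRSGD-style iteration. The tangent-space consensus direction, the QR retraction, and all names (`qr_retraction`, `drsgd_like_step`, `W`, `alpha`) are our assumptions rather than the authors' implementation, and DRGTA's gradient-tracking variable is omitted.

```python
import numpy as np

def qr_retraction(y):
    """Map a d x r matrix to the Stiefel manifold via thin QR factorization."""
    q, rmat = np.linalg.qr(y)
    signs = np.sign(np.sign(np.diag(rmat)) + 0.5)   # make the factorization unique
    return q * signs

def proj_tangent(x, g):
    """Project a Euclidean gradient g onto the tangent space of St(d, r) at x."""
    xtg = x.T @ g
    return g - x @ (xtg + xtg.T) / 2.0

def drsgd_like_step(x_list, grad_list, W, beta, alpha=1.0, t=1):
    """One illustrative DRSGD-style iteration over all n agents.

    x_list[i]    : iterate of agent i, a d x r matrix with orthonormal columns
    grad_list[i] : (stochastic) Euclidean gradient of f_i at x_list[i]
    W            : n x n doubly stochastic mixing matrix of the graph
    t            : number of consensus (communication) rounds per iteration
    """
    X = np.stack(x_list)                        # (n, d, r)
    Wt = np.linalg.matrix_power(W, t)           # t rounds of gossip mixing
    mixed = np.einsum('ij,jdr->idr', Wt, X)     # weighted neighborhood averages
    new_x = []
    for i, x in enumerate(x_list):
        consensus = proj_tangent(x, mixed[i] - x)    # pull toward neighbors, tangentially
        rgrad = proj_tangent(x, grad_list[i])        # Riemannian gradient direction
        new_x.append(qr_retraction(x + alpha * consensus - beta * rgrad))
    return new_x
```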
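The Software Dependencies row notes that the released code uses Python with mpi4py. The fragment below only sketches how one gossip-averaging round on a ring graph could be wired up with mpi4py; the ring topology, the uniform 1/3 mixing weights, and the variable names are illustrative assumptions, not the authors' code.

```python
import numpy as np
from mpi4py import MPI

# Sketch of one gossip-averaging round on a ring graph with mpi4py.
# Assumption: each MPI rank plays one agent and x is its local d x r iterate.
comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

d, r = 100, 5
x = np.linalg.qr(np.random.randn(d, r))[0]                 # a point on the Stiefel manifold

left, right = (rank - 1) % size, (rank + 1) % size
x_from_left = comm.sendrecv(x, dest=right, source=left)    # pass right, receive from left
x_from_right = comm.sendrecv(x, dest=left, source=right)   # pass left, receive from right
x_mixed = (x + x_from_left + x_from_right) / 3.0           # one weighted consensus step
```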
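Finally, for the stopping rules quoted under Experiment Setup, the sketch below shows one way the two metrics could be computed. The projected mean iterate and the rotation-invariant reading of d_s are our interpretations (the paper gives its own definitions); only the 10^-5 / 10^-8 thresholds come from the quoted text.

```python
import numpy as np

def stiefel_mean(x_list):
    """Arithmetic mean of the agents' iterates, projected back onto the Stiefel
    manifold via the polar decomposition; a stand-in for the averaged iterate."""
    m = sum(x_list) / len(x_list)
    u, _, vt = np.linalg.svd(m, full_matrices=False)
    return u @ vt

def subspace_distance(x, x_star):
    """Rotation-invariant distance between span(x) and span(x_star); one common
    reading of d_s (an assumption, not necessarily the paper's exact definition)."""
    return np.linalg.norm(x @ x.T - x_star @ x_star.T, 'fro')

def pca_riem_grad_norm(x, A):
    """Frobenius norm of the Riemannian gradient of f(x) = -trace(x^T A^T A x)/2 on St(d, r)."""
    g = -(A.T @ (A @ x))                    # Euclidean gradient of the PCA objective
    xtg = x.T @ g
    rgrad = g - x @ (xtg + xtg.T) / 2.0     # project onto the tangent space at x
    return np.linalg.norm(rgrad, 'fro')

# Illustrative stopping test with the quoted thresholds (x_list, x_star, A assumed given):
# x_bar = stiefel_mean(x_list)
# done = subspace_distance(x_bar, x_star) <= 1e-8 or pca_riem_grad_norm(x_bar, A) <= 1e-8
```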