Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold

Authors: Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor Tsang, Yong Liu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 NUMERICAL EXPERIMENT: In this section, we compare our DRCGD method with DRDGD (Chen et al., 2021) and DPRGD (Deng & Hu, 2023), which are first-order decentralized Riemannian optimization methods using retraction and projection, respectively, on the following decentralized eigenvector problem: Σ_{i=1}^n tr(x_i^T A_i^T A_i x_i), s.t. x_1 = … = x_n (22). ... The comparison results are shown in Figures 1, 2, and 3. It can be seen from Figure 1 that our DRCGD converges faster than DPRGD under different numbers of agents (n = 16 and n = 32). ... We also present some numerical results on the MNIST dataset (LeCun, 1998). (See the objective sketch below the table.)
Researcher Affiliation | Collaboration | Jun Chen1, Haishan Ye2,3, Mengmeng Wang1, Tianxin Huang1, Guang Dai3, Ivor W. Tsang4,5, Yong Liu1. 1Zhejiang University; 2Xi'an Jiaotong University; 3SGIT AI Lab, State Grid Corporation of China; 4CFAR and IHPC, Agency for Science, Technology and Research; 5SCSE, NTU
Pseudocode | Yes | Algorithm 1 Decentralized Riemannian Conjugate Gradient Descent (DRCGD) for solving Eq. (1). Input: initial point x_0 ∈ M^n, an integer t; set η_{i,0} = −grad f_i(x_{i,0}). 1: for k = 0, 1, … do, for each node i ∈ [n] in parallel. 2: Choose diminishing step size α_k = O(1/…). 3: Update x_{i,k+1} = P_M(Σ_{j=1}^n W^t_{ij} x_{j,k} + α_k η_{i,k}). 4: Compute β_{i,k+1} = ‖grad f_i(x_{i,k+1})‖^2 / ‖grad f_i(x_{i,k})‖^2. 5: Update η_{i,k+1} = −grad f_i(x_{i,k+1}) + β_{i,k+1} P_{T_{x_{i,k+1}}M}(Σ_{j=1}^n W^t_{ij} η_{j,k}). (See the DRCGD iteration sketch below the table.)
Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the methodology described, nor does it include a link to a code repository.
Open Datasets | Yes | We also present some numerical results on the MNIST dataset (LeCun, 1998).
Dataset Splits | No | The paper does not explicitly provide information about training/test/validation dataset splits, such as percentages, sample counts, or specific splitting methodologies. While it mentions the MNIST dataset and its use, it does not detail how the data was partitioned for training, validation, or testing.
Hardware Specification | Yes | The experiments are evaluated with the Intel(R) Core(TM) i7-12700 CPU.
Software Dependencies | No | The paper states 'And the codes are implemented in Python with mpi4py.' However, it does not provide specific version numbers for Python or the mpi4py library, which are necessary for reproducible software dependencies.
Experiment Setup | Yes | We employ fixed step sizes for all comparisons, i.e., the step size is set to α_k = α̂/K with K being the maximal number of iterations. ... We fix m_1 = m_2 = … = m_n = 1000, d = 10, and r = 5. ... We also fix the maximum iteration epoch to 200 and early terminate it if d_s(x̄_k, x*) < 10^-5. ... For brevity, we fix t = 1, r = 5, and d = 784, respectively. ... The step size of our DRCGD, DRDGD, and DPRGD is α_k = α̂/60000. We set the maximum iteration epoch to 1000 and early terminate it if d_s(x̄_k, x*) < 10^-5. (See the stopping-criterion sketch below the table.)
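
The Research Type row quotes the decentralized eigenvector problem of Eq. (22). The following is a minimal NumPy sketch of one agent's local objective and its Riemannian gradient on the Stiefel manifold, assuming the standard formulation f_i(x) = −tr(x^T A_i^T A_i x) (the eigenvector problem maximizes the trace, hence minimizes its negative); the function names and the tangent-space projection formula are our assumptions, not code from the paper.

```python
import numpy as np

def local_objective(x_i, A_i):
    # Agent i's local loss in the decentralized eigenvector problem:
    # f_i(x) = -tr(x^T A_i^T A_i x), with x on the Stiefel manifold St(d, r).
    return -np.trace(x_i.T @ A_i.T @ A_i @ x_i)

def riemannian_grad(x_i, A_i):
    # Riemannian gradient on St(d, r): project the Euclidean gradient
    # onto the tangent space at x_i, i.e. g - x * sym(x^T g).
    g = -2.0 * A_i.T @ (A_i @ x_i)       # Euclidean gradient of f_i
    sym = (x_i.T @ g + g.T @ x_i) / 2.0  # symmetric part of x^T g
    return g - x_i @ sym
```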
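The Pseudocode row reproduces Algorithm 1. Below is a minimal centralized simulation of one synchronous DRCGD iteration, assuming the projection P_M is realized by the polar decomposition and β follows the Fletcher–Reeves rule quoted in step 4; the function names, the precomputed W^t argument, and the small denominator safeguard are our additions, so treat this as a sketch rather than the authors' implementation.

```python
import numpy as np

def proj_stiefel(y):
    # Projection onto the Stiefel manifold via the polar factor:
    # P_M(y) = U V^T for the thin SVD y = U S V^T.
    u, _, vt = np.linalg.svd(y, full_matrices=False)
    return u @ vt

def proj_tangent(x, v):
    # Projection onto the tangent space of St(d, r) at x.
    sym = (x.T @ v + v.T @ x) / 2.0
    return v - x @ sym

def drcgd_step(xs, etas, grad_f, W_t, alpha):
    # One synchronous DRCGD iteration over all n agents, simulated
    # centrally. W_t is the gossip matrix already raised to the power t;
    # grad_f(i, x) returns the Riemannian gradient of f_i at x.
    n = len(xs)
    xs_new = [proj_stiefel(sum(W_t[i, j] * xs[j] for j in range(n))
                           + alpha * etas[i]) for i in range(n)]
    etas_new = []
    for i in range(n):
        g_new, g_old = grad_f(i, xs_new[i]), grad_f(i, xs[i])
        # Fletcher-Reeves coefficient, guarded against a zero denominator.
        beta = np.linalg.norm(g_new) ** 2 / max(np.linalg.norm(g_old) ** 2, 1e-16)
        mix = sum(W_t[i, j] * etas[j] for j in range(n))
        etas_new.append(-g_new + beta * proj_tangent(xs_new[i], mix))
    return xs_new, etas_new
```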
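The Experiment Setup row describes early termination once d_s(x̄_k, x*) < 10^-5 under a fixed step size α_k = α̂/K. The self-contained sketch below shows one common realization of that distance (the projection-metric distance between subspaces) together with the fixed step-size schedule; the paper does not define d_s or give α̂ here, so both, along with the random stand-in iterate, are assumptions.

```python
import numpy as np

def subspace_distance(x, x_star):
    # Projection-metric distance between the column spaces of x and
    # x_star; one common choice for d_s (the paper's exact definition
    # may differ).
    return np.linalg.norm(x @ x.T - x_star @ x_star.T, "fro")

rng = np.random.default_rng(0)
d, r, K = 10, 5, 200                  # sizes from the synthetic setup
alpha_hat = 1.0                       # hypothetical base step size
alpha = alpha_hat / K                 # fixed step size alpha_k = alpha_hat / K
x_star, _ = np.linalg.qr(rng.standard_normal((d, r)))  # ground-truth basis
x_bar, _ = np.linalg.qr(rng.standard_normal((d, r)))   # stand-in averaged iterate
if subspace_distance(x_bar, x_star) < 1e-5:
    print("early termination triggered")  # stop the run once converged
```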