Stochastic Distributed Optimization under Average Second-order Similarity: Algorithms and Analysis

Authors: Dachao Lin, Yuze Han, Haishan Ye, Zhihua Zhang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the advantages of our algorithms, we conduct the same numerical experiments as those in [35, 33]. We focus on the linear ridge regression problem with $\ell_2$ regularization, where the average loss $f$ has the formulation $f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$, with $f_i(x) := \frac{1}{m}\sum_{j=1}^{m}\big(z_{i,j}^\top x - y_{i,j}\big)^2 + \frac{\mu}{2}\|x\|^2$. Here $z_{i,j} \in \mathbb{R}^d$ and $y_{i,j} \in \mathbb{R}$, $i \in [n]$, $j \in [m]$, serve as the feature and label respectively, and $m$ can be viewed as the data size in each local client. We consider a synthetic dataset generated by adding a small random noise matrix to the center matrix, ensuring a small $\delta$. To capture the differences in convergence rates between our methods and SVRP caused by different magnitudes of $\mu$, we vary $\mu = 10^{-i}$, $i \in \{0, 1, 2\}$. We compare our methods (SVRS and Acc SVRS) against SVRG, Katyusha X, SVRP (Catalyzed SVRP is somehow hard to tune so we omit it), and Accelerated Extragradient (Acc EG) using their theoretical step sizes, except that we scale the interpolation parameter $\tau$ in Katyusha X and Acc SVRS for producing practical performance (see Appendix G for detail). From Figure 1, we can observe that for a large $\mu$, SVRP outperforms existing algorithms due to its high-order dependence on $\mu$. However, when the problem becomes ill-conditioned with a small $\mu$, Acc SVRS exhibits significant improvements compared to other algorithms.
Researcher Affiliation | Collaboration | Dachao Lin, Yuze Han, Haishan Ye, Zhihua Zhang. Equal Contribution. Dachao Lin: Academy for Advanced Interdisciplinary Studies, Peking University, lindachao@pku.edu.cn. Yuze Han: School of Mathematical Sciences, Peking University, hanyuze97@pku.edu.cn. Haishan Ye (Corresponding Author): School of Management, Xi'an Jiaotong University; SGIT AI Lab, State Grid Corporation of China; yehaishan@xjtu.edu.cn. Zhihua Zhang: School of Mathematical Sciences, Peking University, zhzhang@math.pku.edu.cn.
Pseudocode | Yes | Algorithm 1: SVRS1ep$(f, w_0, \theta, p)$; Algorithm 2: Accelerated SVRS (Acc SVRS).
Open Source Code | No | The paper does not provide any statement about releasing source code or links to a code repository for the methodology described.
Open Datasets | No | The paper states: 'We consider a synthetic dataset generated by adding a small random noise matrix to the center matrix, ensuring a small δ.' However, it does not provide any concrete access information (link, DOI, formal citation) for this synthetic dataset.
Dataset Splits | No | The paper mentions conducting numerical experiments on a synthetic dataset but does not specify any training, validation, or test splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper describes numerical experiments on synthetic data, but it does not provide any specific details about the hardware used (e.g., GPU models, CPU types, or cloud resources).
Software Dependencies | No | The paper mentions comparing methods like SVRG, Katyusha X, SVRP, and Acc EG, but it does not specify any software dependencies (e.g., programming languages, libraries, or solvers) with version numbers.
Experiment Setup | Yes | To capture the differences in convergence rates between our methods and SVRP caused by different magnitudes of $\mu$, we vary $\mu = 10^{-i}$, $i \in \{0, 1, 2\}$. We compare our methods (SVRS and Acc SVRS) against SVRG, Katyusha X, SVRP (Catalyzed SVRP is somehow hard to tune so we omit it), and Accelerated Extragradient (Acc EG) using their theoretical step sizes, except that we scale the interpolation parameter $\tau$ in Katyusha X and Acc SVRS for producing practical performance (see Appendix G for detail).
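
The experiment setup quoted above can be illustrated with a short sketch. The Python snippet below is a hedged illustration, not the authors' code: the dimensions (n, m, d), the noise scale, the step size, and the plain gradient-descent baseline are all assumptions made for illustration, and none of the compared methods (SVRS, Acc SVRS, SVRP, SVRG, Katyusha X, Acc EG) are implemented here. It only shows the synthetic data generation (a shared center matrix plus a small per-client noise matrix) and the ridge objective $f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$ swept over $\mu = 10^{-i}$, $i \in \{0, 1, 2\}$.

```python
# Hypothetical sketch of the synthetic ridge-regression setup described above.
# Dimensions, noise scale, and the gradient-descent baseline are assumptions,
# not the paper's actual configuration (SVRS / Acc SVRS are not reproduced here).
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 10, 100, 50          # assumed: n clients, m samples per client, dimension d
noise_scale = 1e-3             # assumed: small perturbation -> small similarity delta

# Each client's feature matrix = shared center matrix + small random noise matrix.
Z_center = rng.standard_normal((m, d))
Z = [Z_center + noise_scale * rng.standard_normal((m, d)) for _ in range(n)]
x_true = rng.standard_normal(d)
Y = [Zi @ x_true + 0.01 * rng.standard_normal(m) for Zi in Z]

def f_i(x, Zi, yi, mu):
    """Local ridge loss: (1/m) * ||Z_i x - y_i||^2 + (mu/2) * ||x||^2."""
    return np.mean((Zi @ x - yi) ** 2) + 0.5 * mu * np.dot(x, x)

def f(x, mu):
    """Average loss over the n clients."""
    return np.mean([f_i(x, Zi, yi, mu) for Zi, yi in zip(Z, Y)])

def grad_f(x, mu):
    """Full gradient of the average loss."""
    g = np.zeros(d)
    for Zi, yi in zip(Z, Y):
        g += (2.0 / m) * Zi.T @ (Zi @ x - yi)
    return g / n + mu * x

# Sweep mu = 10^{-i}, i in {0, 1, 2}, as in the experiments; plain gradient
# descent stands in for the variance-reduced methods compared in the paper.
for i in range(3):
    mu = 10.0 ** (-i)
    x = np.zeros(d)
    for _ in range(200):
        x -= 0.01 * grad_f(x, mu)   # assumed fixed step size
    print(f"mu = {mu:g}: f(x) = {f(x, mu):.4f}")
```

The small noise scale keeps each local Hessian close to the average Hessian, which is what the quoted "small δ" plausibly refers to in the average second-order similarity setting studied by the paper.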