Fast Asynchronous Parallel Stochastic Gradient Descent: A Lock-Free Approach with Convergence Guarantee

Authors: Shen-Yi Zhao, Wu-Jun Li

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Both theoretical and empirical results show that AsySVRG can outperform existing state-of-the-art parallel SGD methods like Hogwild! in terms of convergence rate and computation cost." "We choose logistic regression (LR) with an L2-norm regularization term to evaluate our AsySVRG." (A minimal sketch of this objective follows the table.)
Researcher Affiliation | Academia | "Shen-Yi Zhao and Wu-Jun Li, National Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, China. zhaosy@lamda.nju.edu.cn, liwujun@nju.edu.cn"
Pseudocode | Yes | "Our AsySVRG algorithm is presented in Algorithm 1." (Algorithm 1: AsySVRG; a schematic sketch of this loop follows the table.)
Open Source Code | No | The paper does not provide explicit statements or links for the open-source code of the described methodology.
Open Datasets | Yes | "Four datasets are used for evaluation. They are rcv1, realsim, news20, and epsilon, which can be downloaded from the LibSVM website (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/)." (A loading example follows the table.)
Dataset Splits | No | The paper mentions using training data but does not provide specific details on how the datasets were split into training, validation, or test sets (e.g., percentages, sample counts, or predefined splits).
Hardware Specification | Yes | "The experiments are conducted on a server with 12 Intel cores and 64G memory."
Software Dependencies | No | The paper does not provide specific software dependencies or library versions (e.g., Python, PyTorch, or specific solvers with version numbers) used for the experiments.
Experiment Setup | Yes | "We simply set the hyper-parameter λ = 10^-4 in f(w) for all the experiments." "We set M in Algorithm 1 to be 2n/p, where n is the number of training instances and p is the number of threads." "For Hogwild!, in each epoch, each thread runs n/p iterations." "We use a constant step size γ, and we set γ ← 0.9γ after every epoch." (These settings appear in the loop sketch below.)
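
The Research Type and Experiment Setup rows describe the evaluation task: L2-regularized logistic regression with λ = 10^-4. As a reference point, here is a minimal sketch of that objective and its per-example stochastic gradient. The NumPy implementation, the function names, and the ±1 label convention are our own assumptions, not code from the paper.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def loss(w, X, y, lam=1e-4):
    # f(w) = (1/n) * sum_i log(1 + exp(-y_i * x_i^T w)) + (lam/2) * ||w||^2,
    # with labels y_i in {-1, +1}; lam = 1e-4 matches the paper's lambda = 10^-4.
    margins = y * (X @ w)
    return np.mean(np.logaddexp(0.0, -margins)) + 0.5 * lam * (w @ w)

def grad_i(w, x_i, y_i, lam=1e-4):
    # Gradient of one term, with the regularizer folded into each term:
    # f_i(w) = log(1 + exp(-y_i * x_i^T w)) + (lam/2) * ||w||^2.
    return -y_i * sigmoid(-y_i * (x_i @ w)) * x_i + lam * w
```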
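The Pseudocode and Experiment Setup rows outline Algorithm 1 (AsySVRG): an outer loop that snapshots the parameters and computes a full gradient, then p threads that each run M = 2n/p lock-free variance-reduced updates, with the step size multiplied by 0.9 after every epoch. The sketch below only illustrates that control flow under our own assumptions (dense NumPy arrays, Python threads, the grad_i helper from the previous sketch); it is not the authors' implementation, and Python's GIL means it cannot show the lock-free speedups reported in the paper.

```python
import threading
import numpy as np

def asysvrg_sketch(X, y, grad_i, num_epochs=10, p=4, gamma=0.1, decay=0.9, lam=1e-4):
    """Schematic AsySVRG-style loop: full-gradient snapshot plus p lock-free workers."""
    n, d = X.shape
    w = np.zeros(d)          # shared parameter vector, updated by all threads without locks
    M = 2 * n // p           # inner iterations per thread, as set in the paper's experiments

    for epoch in range(num_epochs):
        w_snap = w.copy()    # u0: snapshot taken at the start of the epoch
        # Full gradient at the snapshot (the paper computes this part in parallel as well).
        mu = np.mean([grad_i(w_snap, X[i], y[i], lam) for i in range(n)], axis=0)

        def worker(rng):
            for _ in range(M):
                i = rng.integers(n)
                # Variance-reduced gradient, read from the *current* shared w
                # (reads may be inconsistent across threads; that is the lock-free part).
                v = grad_i(w, X[i], y[i], lam) - grad_i(w_snap, X[i], y[i], lam) + mu
                np.subtract(w, gamma * v, out=w)   # in-place, Hogwild!-style, no lock

        threads = [threading.Thread(target=worker,
                                    args=(np.random.default_rng(epoch * p + t),))
                   for t in range(p)]
        for th in threads:
            th.start()
        for th in threads:
            th.join()

        gamma *= decay       # gamma <- 0.9 * gamma after every epoch, as in the paper
    return w
```

A faithful reproduction of the timing results would use a compiled, shared-memory multi-threaded implementation, which is what the paper's experiments on the 12-core server rely on; this sketch is only meant to make the algorithmic structure concrete.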
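The Open Datasets row points to the LibSVM repository for rcv1, realsim, news20, and epsilon. If reproducing the setup in Python, files in LibSVM (svmlight) format can be read with scikit-learn's load_svmlight_file; the file name below is only a placeholder for a locally downloaded copy.

```python
from sklearn.datasets import load_svmlight_file

# Placeholder path for a dataset downloaded from the LibSVM website,
# e.g. the rcv1 binary training file.
X, y = load_svmlight_file("rcv1_train.binary")
print(X.shape, y.shape)   # X is a sparse CSR matrix; y holds the labels (+/-1 for the binary sets)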