DiSCO: Distributed Optimization for Self-Concordant Empirical Loss

Authors: Yuchen Zhang, Lin Xiao

ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct numerical experiments to compare the DiSCO algorithm with several state-of-the-art distributed optimization algorithms: the ADMM algorithm (Boyd et al., 2010), the accelerated full gradient method (AFG) (Nesterov, 2004), the L-BFGS quasi-Newton method (Nocedal & Wright, 2006, Section 7.2), and the DANE algorithm (Shamir et al., 2014). For comparison, we solve three binary classification tasks using logistic regression. The datasets are obtained from the LIBSVM datasets (Chang & Lin, 2011) and summarized in Table 2. (Figure 2 caption: Comparing DiSCO with other distributed optimization algorithms.)
Researcher Affiliation | Collaboration | Yuchen Zhang (YUCZHANG@EECS.BERKELEY.EDU), University of California Berkeley, Berkeley, CA 94720, USA; Lin Xiao (LIN.XIAO@MICROSOFT.COM), Microsoft Research, Redmond, WA 98053, USA
Pseudocode | Yes | Algorithm 1: Inexact damped Newton method; Algorithm 2: Distributed PCG algorithm; Algorithm 3: DiSCO. (Hedged sketches of the damped Newton step and the PCG subroutine appear after this table.)
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | For comparison, we solve three binary classification tasks using logistic regression. The datasets are obtained from the LIBSVM datasets (Chang & Lin, 2011) and summarized in Table 2.
Dataset Splits | Yes | We split each dataset evenly across m machines, with m ∈ {4, 16, 64}. (A data-splitting sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | Yes | For Algorithm 3, we choose the input parameter µ = m^(1/2) · µ0, where µ0 is manually chosen to be µ0 = 0 for Covtype, µ0 = 4 × 10^-4 for RCV1, and µ0 = 2 × 10^-4 for News20. The regularization parameter is set to γ = 10^-5. For fair comparison, we manually tune the penalty parameter of ADMM and the regularization parameter µ for DANE to optimize their performance. For AFG, we used an adaptive line search scheme (Nesterov, 2013) to speed up its convergence. For L-BFGS, we adopted the memory size 30, as suggested in Nocedal & Wright (2006). (A minimal code sketch of this setup appears after the table.)
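
For concreteness, below is a minimal sketch of the experimental setup described above: the ℓ2-regularized logistic regression objective with γ = 10^-5 and an even split of a dataset across m machines (m ∈ {4, 16, 64}). The paper releases no code, so the function names and signatures here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def logistic_loss(w, X, y, gamma=1e-5):
    """Regularized logistic regression objective (labels y_i in {-1, +1}):
    (1/n) * sum_i log(1 + exp(-y_i * x_i^T w)) + (gamma/2) * ||w||^2."""
    margins = y * (X @ w)
    # log(1 + exp(-t)) computed stably as logaddexp(0, -t)
    loss = np.mean(np.logaddexp(0.0, -margins))
    return loss + 0.5 * gamma * (w @ w)

def split_evenly(X, y, m):
    """Split (X, y) into m (nearly) equal local datasets, one per machine,
    mimicking the even split used in the experiments."""
    idx = np.array_split(np.arange(X.shape[0]), m)
    return [(X[i], y[i]) for i in idx]
```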
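Algorithm 1 (the inexact damped Newton method) computes an approximate Newton direction v_k ≈ [f''(w_k)]^-1 f'(w_k), forms δ_k = sqrt(v_k^T f''(w_k) v_k), and takes the damped step w_{k+1} = w_k - v_k / (1 + δ_k). The sketch below is a single-machine stand-in under that description: `grad` and `hess` are assumed callables, and a dense solve replaces the distributed PCG subroutine that DiSCO actually uses for the direction.

```python
import numpy as np

def inexact_damped_newton(w0, grad, hess, tol=1e-10, max_iter=100):
    """Damped Newton iteration for a self-concordant objective
    (serial sketch of Algorithm 1; the Newton system is solved
    exactly here rather than by distributed PCG)."""
    w = w0.copy()
    for _ in range(max_iter):
        g = grad(w)
        H = hess(w)
        v = np.linalg.solve(H, g)      # (approximate) Newton direction
        delta = np.sqrt(v @ (H @ v))   # approximate Newton decrement
        w = w - v / (1.0 + delta)      # damped Newton step
        if delta <= tol:               # simple stopping rule (an assumption)
            break
    return w
```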
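Algorithm 2 (the distributed PCG algorithm) solves the Newton system by preconditioned conjugate gradient, where each Hessian-vector product averages local products over the m machines (one communication round per PCG iteration) and the preconditioner is, roughly, the master machine's local Hessian plus µI with µ = m^(1/2) · µ0. The following is a serial simulation of that idea under those assumptions; the variable names, stopping rule, and preconditioner details are not taken verbatim from the paper.

```python
import numpy as np

def distributed_pcg(local_hessians, g, mu, eps=1e-8, max_iter=200):
    """Solve (average of local Hessians) v = g by PCG, preconditioned
    with the first machine's local Hessian plus mu * I.
    Serial simulation: in a real deployment H @ u would be computed by
    one round of communication, not by forming H explicitly."""
    H = sum(local_hessians) / len(local_hessians)
    P = local_hessians[0] + mu * np.eye(len(g))
    v = np.zeros_like(g)
    r = g.copy()                      # residual g - H v, with v = 0
    s = np.linalg.solve(P, r)
    u = s.copy()
    rs = r @ s
    for _ in range(max_iter):
        Hu = H @ u
        alpha = rs / (u @ Hu)
        v += alpha * u
        r -= alpha * Hu
        if np.linalg.norm(r) <= eps:
            break
        s = np.linalg.solve(P, r)
        rs_new = r @ s
        u = s + (rs_new / rs) * u
        rs = rs_new
    return v
```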