Distributed Accelerated Proximal Coordinate Gradient Methods

Authors: Yong Ren, Jun Zhu

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the regularized empirical risk minimization problem demonstrate the effectiveness of our algorithm and match our theoretical findings.
Researcher Affiliation | Academia | Yong Ren, Jun Zhu; Center for Bio-Inspired Computing Research, State Key Lab for Intell. Tech. & Systems, Dept. of Comp. Sci. & Tech., TNList Lab, Tsinghua University. Emails: renyong15@mails.tsinghua.edu.cn; dcszj@tsinghua.edu.cn
Pseudocode | Yes | Algorithm 1: The Dis APCG algorithm. Algorithm 2: Dis APCG without full-dimensional vector operations in the case µ > 0. Algorithm 3: Dis APCG for regularized ERM with µ > 0.
Open Source Code | No | The paper states 'We implement the algorithms by C++ and open MPI' but does not provide any link or explicit statement about making their code available.
Open Datasets | Yes | Experiments are performed on 3 datasets from [Fan and Lin, 2011], whose information is summarized in Table 1. [Fan and Lin, 2011] is cited as 'Libsvm data: Classification, regression and multi-label. URL: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets, 2011.'
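For readers re-running the experiments, the cited datasets are distributed in the LIBSVM/svmlight text format. A hedged sketch (not from the paper) of parsing that format with scikit-learn's `load_svmlight_file`; the inline byte string below stands in for a downloaded dataset file:

```python
# Sketch: reading LIBSVM/svmlight-formatted data, the format used by the
# datasets at http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets.
# The inline sample replaces a real downloaded file for illustration.
from io import BytesIO

from sklearn.datasets import load_svmlight_file

# Two labeled rows in svmlight format: "<label> <index>:<value> ..."
sample = b"+1 1:0.5 3:1.2\n-1 2:0.7\n"

# X is a scipy sparse feature matrix, y the label vector.
X, y = load_svmlight_file(BytesIO(sample))
print(X.shape)
print(y)
```

Passing the path of a downloaded dataset (e.g. one of the three used in the paper) instead of the `BytesIO` object works the same way.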
Dataset Splits | No | The paper mentions using datasets but does not provide specific details on how the data was split into training, validation, or test sets (e.g., percentages, sample counts, or predefined splits).
Hardware Specification | No | The paper states: 'We implement the algorithms by C++ and open MPI and run them in clusters on Tianhe-II super computer, where in each node we use a single cpu.' While the Tianhe-II supercomputer is mentioned, the paper lacks specific CPU/GPU models, memory sizes, or other detailed system specifications.
Software Dependencies | No | The paper mentions 'C++ and open MPI' as implementation details but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | We either vary the mini-batch size τ on each node with the number of nodes K fixed, or vice versa. For a fair comparison, we set the mini-batch size to be τ = 10² for our Dis APCG method and Dis DCA. We vary λ from 10⁻⁶ to 10⁻⁸, which is a relatively hard setting since the strong convexity parameter is small. For all settings, we use K = 16 nodes.
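For context on why small λ makes the setting hard: the regularized ERM problem referenced throughout this report has the standard form below (symbols are a conventional formulation assumed here, not quoted from the paper):

```latex
\min_{w \in \mathbb{R}^d} \; P(w) \;=\; \frac{1}{n} \sum_{i=1}^{n} \phi_i\!\left(a_i^{\top} w\right) \;+\; \lambda\, g(w)
```

where the φᵢ are per-example losses, the aᵢ are data points, and g is the regularizer. When g is 1-strongly convex (e.g., g(w) = ½‖w‖²), λ is the strong convexity parameter of the objective, so values in the range 10⁻⁶ to 10⁻⁸ yield a badly conditioned problem, which is why the setting is described as relatively hard.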