Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity

Authors: Conghui Tan, Tong Zhang, Shiqian Ma, Ji Liu

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments suggest that our methods are faster than existing ones such as proximal SGD, SVRG and SAGA on high-dimensional problems.
Researcher Affiliation | Collaboration | Conghui Tan (The Chinese University of Hong Kong, chtan@se.cuhk.edu.hk); Tong Zhang (Tencent AI Lab, tongzhang@tongzhang-ml.org); Shiqian Ma (University of California, Davis, sqma@math.ucdavis.edu); Ji Liu (Tencent AI Lab and University of Rochester, ji.liu.uwisc@gmail.com)
Pseudocode | Yes | Our SPD1 algorithm for solving (5) is presented in Algorithm 1. ... This new algorithm, named SPD1-VR, is presented in Algorithm 2.
Open Source Code | No | The paper does not link to open-source code or state that the code for the described methodology is publicly available.
Open Datasets | Yes | We will test all the algorithms on three real datasets: colon-cancer, gisette and rcv1.binary, downloaded from the LIBSVM website (footnote 2: www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets). The attributes of these data and λ used for each dataset are summarized in Table 1.
Dataset Splits | No | The paper does not mention training/validation/test splits, split percentages, or how data was partitioned for validation.
Hardware Specification | No | The paper does not state the hardware (e.g., GPU/CPU models, memory) used to run the experiments. It only mentions a 'single-machine setting'.
Software Dependencies | No | The paper mentions applying Newton's method to compute proximal mappings, but it does not name any software with version numbers (e.g., libraries, frameworks, or programming languages) used for the experiments.
Experiment Setup | Yes | We always set T = nd for SPD1-VR and T = n for SVRG, where T is the number of inner loops in each outer loop. ... λ is fixed as λ = 10^{-3}. ... The attributes of these data and λ used for each dataset are summarized in Table 1.
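
The experiment setup quoted above can be illustrated with a short sketch. This is not the authors' code (none was released). The function names `load_libsvm` and `inner_loop_lengths` are hypothetical, the parser assumes the standard LIBSVM text format (`label index:value ...` with 1-based feature indices), and the two sample lines are made-up data rather than rows from colon-cancer, gisette or rcv1.binary.

```python
def load_libsvm(lines):
    """Parse lines in LIBSVM text format: 'label index:value index:value ...'.

    Returns (labels, rows, d), where each row is a sparse dict mapping a
    1-based feature index to its value and d is the largest index seen.
    """
    labels, rows, d = [], [], 0
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        labels.append(float(parts[0]))
        feats = {}
        for item in parts[1:]:
            idx, val = item.split(":")
            feats[int(idx)] = float(val)
        rows.append(feats)
        if feats:
            d = max(d, max(feats))
    return labels, rows, d


def inner_loop_lengths(n, d):
    """Inner-loop counts T quoted in the setup: T = n*d (SPD1-VR), T = n (SVRG)."""
    return {"SPD1-VR": n * d, "SVRG": n}


# Made-up sample in LIBSVM format (assumption: not taken from the real datasets).
sample = ["+1 1:0.5 3:-1.2", "-1 2:2.0"]
labels, rows, d = load_libsvm(sample)
n = len(labels)
T = inner_loop_lengths(n, d)
```

The larger T for SPD1-VR reflects the setup quoted in the table, where each SPD1-VR inner iteration touches a single coordinate of a single sample, so n·d such iterations correspond roughly to one pass over the data.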