Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization
Authors: Pan Zhou, Xiao-Tong Yuan
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we carry out experiments to compare the numerical performance of HSDMPG with several representative stochastic gradient optimization algorithms, including SGD (Robbins & Monro, 1951), SVRG (Johnson & Zhang, 2013), APCG (Lin et al., 2014), Katyusha (Allen-Zhu, 2017) and SCSG (Lei & Jordan, 2017). We evaluate all the considered algorithms on two sets of strongly-convex learning tasks. The first set is for ridge regression with least-squares loss... In the second setting we consider two classification models: logistic regression... and multi-class softmax regression... We run simulations on ten datasets whose details are described in Appendix D.4. ... Figure 1: Single-epoch processing: stochastic gradient algorithms process data in a single pass on quadratic problems. ... Figure 2: Multi-epoch processing: stochastic gradient algorithms process data in multiple passes on quadratic problems. ... Figure 3: Multi-epoch processing (about 8 epochs): stochastic gradient algorithms process data in multiple passes on logistic regression problems (ijcnn and w08) and softmax regression problems (protein and letter). (Hedged sketches of these ℓ2-regularized objectives are given after this table.) |
| Researcher Affiliation | Collaboration | ¹Salesforce Research; ²B-DAT Lab and CICAEET, Nanjing University of Information Science & Technology, Nanjing, 210044, China. Correspondence to: Xiao-Tong Yuan <xtyuan@nuist.edu.cn>. |
| Pseudocode | Yes | Algorithm 1 Hybrid Stochastic-Deterministic Minibatch Proximal Gradient (HSDMPG) for quadratic loss. ... Algorithm 2 Hybrid Stochastic-Deterministic Minibatch Proximal Gradient (HSDMPG) on the generic loss. |
| Open Source Code | No | The paper does not provide a direct statement or link for the open-source code of the described methodology. |
| Open Datasets | Yes | We run simulations on ten datasets whose details are described in Appendix D.4. ... Appendix D.4 Datasets Details: We conduct experiments on ten datasets from LIBSVM (Chang & Lin, 2011) and UCI (Dua & Graff, 2017) repository, which cover a wide range of applications and data properties. |
| Dataset Splits | No | The paper mentions training and test sets (e.g., for 'ijcnn1': "The training set consists of 49,040 samples with 22 features, and the test set consists of 9,869 samples"), but does not provide specific details for a validation set or explicit split percentages for training, validation, and test. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using specific algorithms (e.g., SVRG) and datasets from LIBSVM and UCI, but it does not specify any programming languages, libraries, or frameworks with their version numbers that were used for implementation. |
| Experiment Setup | Yes | For HSDMPG, we set the size s of S around n^0.75. For the minibatch for inner problems, we set the initial minibatch size |S_1| = 50 and then follow our theory to exponentially expand the size of S_t with a proper exponential rate. The regularization constant in the subproblem (3) is set to γ = √(log(d)/s) as suggested by our theory. The optimization error ε_t in (3) is controlled by respectively allowing SVRG to run 3 epochs and 10 epochs on the two sets of tasks. Similarly, we control the optimization error ε_t in (5) by running SVRG with 3 epochs. ... we set the regularization parameter µ = 0.01 to make the quadratic problems well-conditioned. ... Here we reset the regularization strength parameter in quadratic problems as µ = 10^-4 for generating more challenging optimization tasks. ... their regularization modulus parameters are set as µ = 0.01. (A hedged configuration sketch assembling these settings follows the table.) |
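
The Research Type row above quotes the paper's two task families: ridge regression with least-squares loss, and logistic/softmax regression, each with an ℓ2 regularization modulus µ. The sketch below is not the authors' code; it is a minimal illustration, under the assumption of standard formulations of these objectives, with placeholder function names chosen here for clarity.

```python
import numpy as np

# Hypothetical sketch (not from the paper) of the three l2-regularized
# objectives named in the experiments: ridge regression, binary logistic
# regression, and multi-class softmax regression, each with modulus mu.

def ridge_objective(w, X, y, mu):
    """Least-squares loss with l2 regularization: (1/2n)||Xw - y||^2 + (mu/2)||w||^2."""
    n = X.shape[0]
    residual = X @ w - y
    return 0.5 * (residual @ residual) / n + 0.5 * mu * (w @ w)

def logistic_objective(w, X, y, mu):
    """Binary logistic loss with labels y in {-1, +1}, plus l2 regularization."""
    margins = y * (X @ w)
    # log(1 + exp(-margin)) computed stably via logaddexp
    return np.mean(np.logaddexp(0.0, -margins)) + 0.5 * mu * (w @ w)

def softmax_objective(W, X, y, mu):
    """Multi-class cross-entropy; W has shape (d, K), y holds integer class indices."""
    n = X.shape[0]
    scores = X @ W                                  # (n, K) class scores
    scores -= scores.max(axis=1, keepdims=True)     # shift for numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(n), y]) + 0.5 * mu * np.sum(W * W)
```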
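
The Experiment Setup row reports the hyperparameters used for HSDMPG: subsample size s around n^0.75, γ = √(log(d)/s) for subproblem (3), an inner minibatch starting at |S_1| = 50 and expanded exponentially, SVRG as the inexact inner solver (3 or 10 epochs), and µ ∈ {0.01, 10^-4}. The configuration sketch below assembles these reported values; it is only an assumption-labeled illustration, and in particular `growth_rate` and the function name `hsdmpg_config` are placeholders, since the paper's exact exponential rate is not reproduced here.

```python
import math

# Hypothetical configuration sketch (not the authors' code) collecting the
# hyperparameters quoted in the Experiment Setup row above.

def hsdmpg_config(n, d, growth_rate=2.0, n_stages=10):
    s = int(round(n ** 0.75))                # subsample size s around n^0.75
    gamma = math.sqrt(math.log(d) / s)       # regularization constant for subproblem (3)
    # Inner minibatch schedule: |S_1| = 50, then exponential expansion
    # (growth_rate is an assumed placeholder), capped at the subsample size s.
    minibatch_sizes = [min(int(50 * growth_rate ** t), s) for t in range(n_stages)]
    return {
        "subsample_size": s,
        "gamma": gamma,
        "minibatch_sizes": minibatch_sizes,
        "inner_solver": "SVRG",              # run for 3 or 10 epochs per subproblem
        "mu_well_conditioned": 1e-2,         # well-conditioned quadratic setting
        "mu_ill_conditioned": 1e-4,          # more challenging quadratic setting
    }

# Example with the reported ijcnn1 training set size and feature dimension.
print(hsdmpg_config(n=49040, d=22))
```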