Stochastic Variance-Reduced Cubic Regularized Newton Methods
Authors: Dongruo Zhou, Pan Xu, Quanquan Gu
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Thorough experiments on various non-convex optimization problems support our theory. [...] In this section, we present numerical experiments on different non-convex Empirical Risk Minimization (ERM) problems and on different datasets to validate the advantage of our SVRC algorithm in finding approximate local minima. [...] From Figures 1, 2 and 3, we can see that our algorithm SVRC outperforms all the other baseline algorithms on all the datasets. |
| Researcher Affiliation | Academia | Dongruo Zhou 1 Pan Xu 1 Quanquan Gu 1 1Department of Computer Science, University of California, Los Angeles, CA 90095, USA. Correspondence to: Quanquan Gu <qgu@cs.ucla.edu>. |
| Pseudocode | Yes | Algorithm 1 Stochastic Variance Reduction Cubic Regularization (SVRC) 1: Input: batch size b_g, b_h, cubic penalty parameter {M_{s,t}}, epoch number S, epoch length T and starting point x_0. 2: Initialization x̂^1 = x_0 3: for s = 1, . . . , S do... |
| Open Source Code | No | The paper does not provide any specific links to open-source code or explicitly state that the code for their method is publicly available. |
| Open Datasets | Yes | The datasets we use are a9a, covtype, ijcnn1, which are common datasets used in ERM problems. The detailed information about these datasets are in Table 2. (Table 2 lists 'a9a', 'covtype', 'ijcnn1' with sample size and dimension). |
| Dataset Splits | No | The paper does not explicitly provide details about training, validation, or test splits for the datasets used in the experiments. |
| Hardware Specification | No | The paper does not specify any hardware used for the experiments, such as CPU or GPU models, or other specific machine configurations. |
| Software Dependencies | No | The paper mentions using a 'Lanczos-type method' for the subproblem solver but does not provide specific software names with version numbers for any dependencies. |
| Experiment Setup | Yes | Parameters and subproblem solver: For each algorithm and each dataset, we choose different b_g, b_h, T for the best performance. Meanwhile, we choose M_{s,t} = α/(1 + β)^(s + t/T), α, β > 0 for each iteration. [...] We set α = 0.05, β = 0 for a9a and ijcnn1 datasets and α = 5e3, β = 0.15 for covtype. |
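
The cubic penalty schedule quoted in the Experiment Setup row can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `cubic_penalty`, the argument name `alpha` (the numerator constant, 0.05 or 5e3 in the quoted settings), and the choice of letting `t` run from 0 to T - 1 inside each epoch are not taken from the paper's code, which is not publicly available.

```python
def cubic_penalty(s, t, T, alpha, beta):
    """Hypothetical helper: penalty alpha / (1 + beta) ** (s + t / T).

    s     -- epoch index (the quoted pseudocode starts at s = 1)
    t     -- inner iteration index (assumed here to run 0 .. T - 1)
    T     -- epoch length
    alpha -- numerator constant of the schedule (assumed name)
    beta  -- decay rate; beta = 0 keeps the penalty constant
    """
    if T <= 0:
        raise ValueError("epoch length T must be positive")
    if beta < 0:
        raise ValueError("beta must be non-negative")
    return alpha / (1.0 + beta) ** (s + t / T)


def penalty_schedule(S, T, alpha, beta):
    """List the penalty for every (s, t) pair, epochs 1..S, steps 0..T-1."""
    return [
        cubic_penalty(s, t, T, alpha, beta)
        for s in range(1, S + 1)
        for t in range(T)
    ]


# Settings quoted for a9a and ijcnn1: beta = 0, so the penalty never changes.
constant = penalty_schedule(S=3, T=4, alpha=0.05, beta=0.0)

# Settings quoted for covtype: beta = 0.15, so the penalty shrinks over time.
decaying = penalty_schedule(S=3, T=4, alpha=5e3, beta=0.15)
```

With β = 0 the schedule reduces to a fixed penalty, while β > 0 shrinks the penalty geometrically as the fractional epoch count s + t/T grows, which relaxes the cubic regularization in later iterations.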