Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods

Authors: Kimon Antonakopoulos, Ali Kavis, Volkan Cevher

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we will present practical performance of EXTRA-NEWTON against a set of first-order algorithms, e.g., GD, SGD, ADAGRAD [20], ACCELEGRAD [41], UNIXGRAD [35]; and second-order methods, e.g., NEWTON'S, Optimal Monteiro-Svaiter (OPTMS) [13], Cubic Regularization of Newton's method (CRN) [58] and Accelerated CRN (ACRN) [54], for least squares and logistic regression problems over the LIBSVM datasets a1a and a9a.
Researcher Affiliation | Academia | Kimon Antonakopoulos, EPFL (LIONS), kimon.antonakopoulos@epfl.ch; Ali Kavis, EPFL (LIONS), ali.kavis@epfl.ch; Volkan Cevher, EPFL (LIONS), volkan.cevher@epfl.ch
Pseudocode | Yes | Algorithm 1: EXTRA-NEWTON
Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No]
Open Datasets | Yes | for least squares and logistic regression problems over the LIBSVM datasets a1a and a9a (a hedged loading sketch follows the table).
Dataset Splits | No | The paper mentions using LIBSVM datasets but does not provide specific training/validation/test split percentages, sample counts, or explicit splitting methodology.
Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No]
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment.
Experiment Setup | Yes | For the logistic regression problem, we regularize it with g(x) = (1/2)‖x‖², but use a very small regularization constant to render the problem ill-conditioned, making things slightly more difficult for the algorithms [47, 48]. Although we implement NEWTON'S with line-search, we observed sporadic convergence behavior; when the initial point is close to the solution it converges similarly to EXTRA-NEWTON, but when we initialize further away it doesn't converge. This non-convergent behavior has been known for NEWTON'S, even with line-search present [27]. On the contrary, EXTRA-NEWTON consistently converges; even if we perturb the initial step-size and make it adversarially large, it manages to recover due to its adaptive step-size.
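
For context on the Open Datasets row, here is a minimal loading sketch for the a1a and a9a LIBSVM datasets. It is an assumption-laden illustration, not the paper's pipeline: it assumes scikit-learn is installed and that the raw LIBSVM files have been downloaded locally under the placeholder names "a1a" and "a9a".

```python
# Hypothetical data loading; file paths are placeholders, since the paper
# does not describe how the datasets were obtained or preprocessed.
from sklearn.datasets import load_svmlight_file

# load_svmlight_file parses the LIBSVM sparse format and returns a sparse
# design matrix together with a label vector (labels are in {-1, +1}).
A_a1a, b_a1a = load_svmlight_file("a1a")
A_a9a, b_a9a = load_svmlight_file("a9a")

print(A_a1a.shape, A_a9a.shape)
```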
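
The Experiment Setup row describes an l2-regularized logistic regression objective with a very small regularization constant. The sketch below spells that objective out in code; the constant lam, the 1/n averaging, and the helper names are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def logreg_objective(x, A, b, lam=1e-8):
    """(1/n) * sum_i log(1 + exp(-b_i * <a_i, x>)) + (lam/2) * ||x||^2,
    with lam deliberately tiny so the problem stays ill-conditioned."""
    margins = b * (A @ x)
    loss = np.mean(np.logaddexp(0.0, -margins))  # stable log(1 + exp(-t))
    return loss + 0.5 * lam * np.dot(x, x)       # regularizer g(x) = (1/2)||x||^2

def logreg_gradient(x, A, b, lam=1e-8):
    """Gradient of the objective above; A may be dense or a scipy sparse matrix."""
    margins = b * (A @ x)
    s = 1.0 / (1.0 + np.exp(margins))            # equals sigmoid(-margin)
    return -(A.T @ (b * s)) / A.shape[0] + lam * x
```

With lam this small, the regularizer barely lifts the smallest Hessian eigenvalue, which is what makes the problem ill-conditioned and stresses Newton-type methods in the way the row above describes.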