Regularized Nonlinear Acceleration

Authors: Damien Scieur, Alexandre d'Aspremont, Francis Bach

NeurIPS 2016

Reproducibility Variable Result LLM Response
Research Type Experimental Numerical experiments are detailed on classical classification problems. We test our methods on a regularized logistic regression problem written f(w) = Σ_{i=1}^m log(1 + exp(−y_i ξ_iᵀ w)) + (τ/2)‖w‖₂², where Ξ = [ξ_1, ..., ξ_m]ᵀ ∈ ℝ^{m×n} is the design matrix and y is a {−1, 1}^n vector of labels. We used the Madelon UCI dataset, setting τ = 10² (in order to have a ratio L/τ equal to 10⁹). We solve this problem using several algorithms: the fixed-step gradient method for strongly convex functions [6, Th. 2.1.15] using stepsize 2/(L + µ), where L = ‖Ξ‖₂²/4 + τ and µ = τ; the accelerated gradient method for strongly convex functions [6, Th. 2.2.3]; and our nonlinear acceleration of the gradient method iterates using RMPE in Proposition 2.6 with restarts. The results are reported in Figure 1.
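The quoted objective and fixed-step update can be sketched in NumPy as follows. This is an illustrative reconstruction, not the authors' code; the function names (`logistic_loss`, `logistic_grad`, `gradient_step`) are ours, and only the formulas quoted above (the loss, L = ‖Ξ‖₂²/4 + τ, µ = τ, stepsize 2/(L + µ)) come from the excerpt:

```python
import numpy as np

def logistic_loss(w, Xi, y, tau):
    """Regularized logistic loss: sum_i log(1 + exp(-y_i xi_i^T w)) + (tau/2) ||w||^2."""
    margins = -y * (Xi @ w)
    return np.sum(np.logaddexp(0.0, margins)) + 0.5 * tau * (w @ w)

def logistic_grad(w, Xi, y, tau):
    """Gradient of the loss above."""
    margins = -y * (Xi @ w)
    sigma = 1.0 / (1.0 + np.exp(-margins))   # plain sigmoid of the margins
    return -(Xi.T @ (y * sigma)) + tau * w

def gradient_step(w, Xi, y, tau):
    """One fixed-step gradient update with stepsize 2/(L + mu)."""
    L = np.linalg.norm(Xi, 2) ** 2 / 4 + tau  # smoothness bound from the excerpt
    mu = tau                                  # strong-convexity constant
    return w - (2.0 / (L + mu)) * logistic_grad(w, Xi, y, tau)
```

Since 2/(L + µ) < 2/L, each step strictly decreases the objective away from the optimum.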
Researcher Affiliation Academia Damien Scieur, INRIA & D.I., UMR 8548, École Normale Supérieure, Paris, France (damien.scieur@inria.fr); Alexandre d'Aspremont, CNRS & D.I., UMR 8548, École Normale Supérieure, Paris, France (aspremon@di.ens.fr); Francis Bach, INRIA & D.I., UMR 8548, École Normale Supérieure, Paris, France (francis.bach@inria.fr)
Pseudocode Yes Algorithm 1 Regularized Approximate Minimal Polynomial Extrapolation (RMPE). Input: sequence {x_0, x_1, ..., x_{k+1}}, parameter λ > 0. Compute U = [x_1 − x_0, ..., x_{k+1} − x_k]. Solve the linear system (UᵀU + λI)z = 1. Set c = z/(zᵀ1). Output: Σ_{i=0}^k c_i x_i, the approximation of the fixed point x*.
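A minimal NumPy sketch of the quoted pseudocode, assuming the iterates are given as 1-D arrays; the function name `rmpe` and its interface are ours, not taken from any released implementation:

```python
import numpy as np

def rmpe(xs, lam):
    """Regularized Approximate Minimal Polynomial Extrapolation (Algorithm 1).

    xs  : list of iterates [x_0, ..., x_{k+1}], each a 1-D array
    lam : regularization parameter lambda > 0
    Returns the extrapolated approximation of the fixed point x*.
    """
    X = np.stack(xs, axis=1)         # columns x_0, ..., x_{k+1}
    U = np.diff(X, axis=1)           # U = [x_1 - x_0, ..., x_{k+1} - x_k]
    m = U.shape[1]                   # k + 1 difference columns
    ones = np.ones(m)
    z = np.linalg.solve(U.T @ U + lam * np.eye(m), ones)  # (U^T U + lam I) z = 1
    c = z / (z @ ones)               # normalize so the coefficients sum to 1
    return X[:, :m] @ c              # sum_{i=0}^{k} c_i x_i
```

For a linear fixed-point iteration x_{t+1} = A x_t + b with enough iterates, this recovers the fixed point nearly exactly as λ → 0.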
Open Source Code No The paper does not provide any concrete access to source code for the methodology described. There are no links to repositories or explicit statements about code release.
Open Datasets Yes We used the Madelon UCI dataset, setting τ = 10² (in order to have a ratio L/τ equal to 10⁹).
Dataset Splits No The paper mentions using the Madelon UCI dataset but does not specify any training, validation, or test splits in terms of percentages, sample counts, or predefined citations. Therefore, information needed to reproduce data partitioning is not explicitly provided.
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used for running its experiments. It only mentions "CPU Time (sec.)" in Figure 1, but no specifications.
Software Dependencies No The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment.
Experiment Setup Yes We used the Madelon UCI dataset, setting τ = 10² (in order to have a ratio L/τ equal to 10⁹). We solve this problem using several algorithms: the fixed-step gradient method for strongly convex functions [6, Th. 2.1.15] using stepsize 2/(L + µ), where L = ‖Ξ‖₂²/4 + τ and µ = τ; the accelerated gradient method for strongly convex functions [6, Th. 2.2.3]; and our nonlinear acceleration of the gradient method iterates using RMPE in Proposition 2.6 with restarts. This last algorithm is implemented as follows. We do k steps (in the numerical experiments, k is typically equal to 5) of the gradient method, then extrapolate a solution Xc_λ, where c_λ is computed by solving the RMPE problem (17) on the gradient iterates X, with regularization parameter λ determined by a grid search.
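The restarted scheme described in this excerpt (a few gradient steps, then an RMPE extrapolation with λ chosen by grid search on the objective) could look roughly as follows. This is a sketch under our own naming assumptions (`rmpe`, `restarted_rna`, the λ grid, and the restart count are illustrative), not the paper's implementation:

```python
import numpy as np

def rmpe(xs, lam):
    """Algorithm 1 (RMPE) applied to a list of iterates xs = [x_0, ..., x_{k+1}]."""
    X = np.stack(xs, axis=1)
    U = np.diff(X, axis=1)
    m = U.shape[1]
    z = np.linalg.solve(U.T @ U + lam * np.eye(m), np.ones(m))
    c = z / z.sum()
    return X[:, :m] @ c

def restarted_rna(f, grad_f, x0, step, k=5, n_restarts=20,
                  lambda_grid=np.logspace(-12, 0, 13)):
    """Restarted nonlinear acceleration: gradient steps, extrapolate, repeat."""
    x = x0
    for _ in range(n_restarts):
        xs = [x]
        for _ in range(k + 1):        # k+1 gradient steps -> iterates x_0 .. x_{k+1}
            xs.append(xs[-1] - step * grad_f(xs[-1]))
        # grid-search lambda: keep the extrapolation with the lowest objective value
        x = min((rmpe(xs, lam) for lam in lambda_grid), key=f)
    return x
```

On a strongly convex quadratic, each restart extrapolates from the gradient iterates and converges much faster than the plain gradient method.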