Convex optimization based on global lower second-order models

Authors: Nikita Doikov, Yurii Nesterov

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 7 contains numerical experiments. ... We see that for bigger D, it becomes harder to solve the optimization problem. Second-order methods demonstrate good performance both in terms of iterations and total computational time. ... In the next set of experiments, we compare the basic stochastic version of our method using estimators (25) (SNewton), the method with variance reduction (Algorithm 4, SVRNewton), and first-order algorithms (with constant step-size, tuned for each problem): SGD and SVRG [21].
Researcher Affiliation | Academia | Nikita Doikov, Catholic University of Louvain, Louvain-la-Neuve, Belgium (Nikita.Doikov@uclouvain.be); Yurii Nesterov, Catholic University of Louvain, Louvain-la-Neuve, Belgium (Yurii.Nesterov@uclouvain.be)
Pseudocode | Yes | Algorithm 1: Contracting-Domain Newton Method, I ... Algorithm 2: Contracting-Domain Newton Method, II ... Algorithm 3: Aggregating Newton Method ... Algorithm 4: Stochastic Variance-Reduced Contracting-Domain Newton (an illustrative sketch follows the table)
Open Source Code | Yes | The source code can be found at https://github.com/doikov/contracting-newton/
Open Datasets | Yes | The composite part is given by (4), with p = 2. ... determined by the dataset [footnote 4: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/]. (A loading sketch follows the table.)
Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits, percentages, or absolute sample counts needed to reproduce the experiment.
Hardware Specification | Yes | Clock time was evaluated on a machine with an Intel Core i5 CPU, 1.6 GHz, and 8 GB RAM.
Software Dependencies | No | The paper states 'All methods were implemented in C++', but does not provide specific version numbers for compilers, libraries, or other software dependencies.
Experiment Setup | No | The paper mentions 'constant step-size, tuned for each problem' for some algorithms, but it does not provide specific hyperparameter values such as learning rates, batch sizes, number of epochs, or other training-configuration details needed for reproduction.
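
The Pseudocode row lists four Newton-type algorithms, but the summary above carries no executable detail. The following minimal Python sketch shows only the general shape of a contracting Newton-type iteration on l2-regularized logistic regression (the loss family used with the LIBSVM datasets). The contraction coefficient gamma = 3/(k + 3), the scaled quadratic model, the regularization weight lam, and all function names are illustrative assumptions; this is not the authors' Algorithm 1 or their C++ implementation.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def logreg_grad_hess(A, b, x):
    """Gradient and Hessian of f(x) = (1/m) * sum_i log(1 + exp(-b_i <a_i, x>))."""
    m = A.shape[0]
    margins = b * (A @ x)
    s = sigmoid(-margins)                  # sigma(-b_i <a_i, x>)
    grad = -(A.T @ (s * b)) / m
    w = s * (1.0 - s)                      # per-example curvature weights
    hess = (A.T * w) @ A / m
    return grad, hess

def contracting_newton(A, b, lam=1e-3, n_iters=50):
    """Hypothetical contracting Newton-type loop for l2-regularized logistic
    regression. The coefficient gamma = 3/(k + 3), the scaled quadratic model,
    and the averaging step are assumptions made for illustration; they are not
    taken verbatim from the paper's Algorithm 1."""
    n = A.shape[1]
    x = np.zeros(n)
    for k in range(n_iters):
        gamma = 3.0 / (k + 3)
        g, H = logreg_grad_hess(A, b, x)
        # v minimizes <g, v> + (gamma/2)(v - x)^T H (v - x) + (lam/2)||v||^2
        v = np.linalg.solve(gamma * H + lam * np.eye(n), gamma * (H @ x) - g)
        x = (1.0 - gamma) * x + gamma * v
    return x

# Tiny synthetic check (not one of the paper's LIBSVM problems).
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))
b = np.sign(A @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200))
x_hat = contracting_newton(A, b)
```

Reproducing the paper's figures would additionally require the exact composite term (4), the specific datasets, and the tuned step-sizes flagged as missing in the Experiment Setup row.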
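
The Open Datasets row points to the LIBSVM collection without naming specific files or a loader. A short sketch of how such data could be read in Python, assuming scikit-learn is available; the file name a9a is a placeholder, since the summary does not pin down which datasets or splits were used.

```python
from sklearn.datasets import load_svmlight_file

# Any file from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
# works here; "a9a" is only a placeholder, since the summary above does not
# specify which datasets or splits the experiments used.
X, y = load_svmlight_file("a9a")   # X: sparse CSR features, y: +/-1 labels
print(X.shape, y[:5])
```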