reproducibilityindex.ai

Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update

Authors: Michał Dereziński, Jonathan Lacotte, Mert Pilanci, Michael W. Mahoney

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluated our theory on a range of different problems, and we have found that the more precise analysis that our theory provides describes well the convergence behavior for a range of optimization problems. In this section, we present numerical simulations illustrating this for regularized logistic regression and least squares regression, with different datasets ranging from medium to large scale: the CIFAR-10 dataset, the Musk dataset, and WESAD [SRD+18].
Researcher Affiliation	Academia	Michał Dereziński Department of Statistics University of California, Berkeley mderezin@berkeley.edu Jonathan Lacotte Department of Electrical Engineering Stanford University lacotte@stanford.edu Mert Pilanci Department of Electrical Engineering Stanford University pilanci@stanford.edu Michael W. Mahoney ICSI and Department of Statistics University of California, Berkeley mmahoney@stat.berkeley.edu
Pseudocode	No	The paper describes mathematical derivations and algorithms but does not present them in a pseudocode or explicitly labeled algorithm block format.
Open Source Code	No	The paper does not include any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	Yes	We present numerical simulations illustrating this for regularized logistic regression and least squares regression, with different datasets ranging from medium to large scale: the CIFAR-10 dataset, the Musk dataset, and WESAD [SRD+18].
Dataset Splits	Yes	For each dataset, we choose the value of λ among $\{10^{-j} \mid j = 0, \dots, 8\}$ that minimizes the error on a hold out validation set.
Hardware Specification	No	The paper states 'see Appendix E for hardware details', but Appendix E is not provided in the submitted text, thus specific hardware models or specifications cannot be determined.
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies, libraries, or solvers used in the experiments.
Experiment Setup	Yes	We use a sketch size m = d/2 for NS. In the bottom plots, we report the CPU and GPU wall-clock times to reach a $10^{-6}$ accurate solution for NS with different sketching methods. For each dataset, we choose the value of λ among $\{10^{-j} \mid j = 0, \dots, 8\}$ that minimizes the error on a hold out validation set. For CIFAR-10 and Musk, we pick λ = $10^{-4}$. For WESAD, we pick λ = $10^{-5}$.
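
The hold-out selection of λ quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code (the paper releases none). It assumes the candidate grid is λ = 10^{-j} for j = 0, ..., 8, and it substitutes a toy single-feature ridge regression on synthetic data for the paper's datasets and models. The function names `ridge_fit_1d`, `validation_error`, and `select_lambda` are hypothetical.

```python
import random


def ridge_fit_1d(a, b, lam):
    """Closed-form minimizer of 0.5*sum((a_i*x - b_i)^2) + 0.5*lam*x^2 over scalar x."""
    return sum(ai * bi for ai, bi in zip(a, b)) / (sum(ai * ai for ai in a) + lam)


def validation_error(x, a_val, b_val):
    """Mean squared error of the fitted scalar model on the hold-out set."""
    return sum((ai * x - bi) ** 2 for ai, bi in zip(a_val, b_val)) / len(a_val)


def select_lambda(a_train, b_train, a_val, b_val, grid):
    """Return (best lambda, hold-out errors) over the candidate grid."""
    errors = [
        validation_error(ridge_fit_1d(a_train, b_train, lam), a_val, b_val)
        for lam in grid
    ]
    best = min(range(len(grid)), key=errors.__getitem__)
    return grid[best], errors


# Synthetic stand-in data (hypothetical; not one of the paper's datasets).
rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(200)]
b = [2.0 * ai + rng.gauss(0.0, 0.5) for ai in a]
a_train, b_train = a[:150], b[:150]
a_val, b_val = a[150:], b[150:]

# Assumed candidate grid: 1, 0.1, ..., 1e-8.
grid = [10.0 ** (-j) for j in range(9)]
best_lam, errors = select_lambda(a_train, b_train, a_val, b_val, grid)
print("selected lambda:", best_lam)
```

The same loop carries over to any model with a regularization parameter: fit on the training split for each candidate λ, score on the hold-out split, and keep the candidate with the lowest error.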