Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization

Authors: Jonathan Lacotte, Mert Pilanci

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Further, we show numerically that it outperforms standard iterative solvers such as the conjugate gradient method and its pre-conditioned version on several standard machine learning datasets. Numerical simulations were carried out on a 512Gb desktop station and implemented in Python using its standard numerical linear algebra modules3.
Researcher Affiliation Academia Jonathan Lacotte Department of Electrical Engineering Stanford University lacotte@stanford.edu Mert Pilanci Department of Electrical Engineering Stanford University pilanci@stanford.edu
Pseudocode Yes Algorithm 1: Adaptive Polyak-IHS method.
Open Source Code Yes Code is publicly available at https://github.com/jonathanlctt/eff_dim_solver
Open Datasets Yes We present in Figures 1 and 2 results for two standard datasets (see Appendix A.1 for additional experiments): (i) one-vs-all classification of MNIST digits and (ii) one-vs-all classification of CIFAR10 images.
Dataset Splits No The paper mentions using standard datasets like MNIST and CIFAR10 but does not specify train/validation/test splits or a validation set.
Hardware Specification No The paper states "Numerical simulations were carried out on a 512Gb desktop station", but does not specify CPU or GPU models, or other detailed hardware specifications.
Software Dependencies No The paper states "implemented in Python using its standard numerical linear algebra modules", but does not provide specific version numbers for Python or any libraries used.
Experiment Setup Yes Algorithm 1 provides inputs such as "initial sketch size m 1, initial points x0, x1 Rd, target convergence rates cgd, cp (0, 1), gradient descent step size µgd, Polyak step size µp 0 and momentum parameter βp 0". Also, in section 5, it mentions "For each value of ν, we stop the algorithm once ε = 10^-10-precision is reached."