Optimal Randomized First-Order Methods for Least-Squares Problems

Authors: Jonathan Lacotte, Mert Pilanci

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | First, we generate a synthetic dataset with n = 8192, d = 1600 and m ∈ {1700, 3500, 5700}. Although our results are universal in the sense that they do not depend on the spectral decay of the matrix A, we still consider a challenging setting for first-order methods, that is, we generate an n × d matrix A with an exponential spectral decay. In Figure 2, we verify numerically that Algorithm 2 is faster than the best Heavy-ball method with refreshed SRHT sketches ("SRHT (refreshed)"), and than Algorithm 1. Further, we compare Algorithm 2 to the Heavy-ball method with fixed SRHT embedding whose parameters are found based on edge eigenvalues analysis, using either our new density f_h ("SRHT (edge eig.)") as described previously in Section 2.2.1, or the previous bounds derived by (Tropp, 2011) ("SRHT (baseline)"). As predicted, Algorithm 2 performs very similarly to the former, and better than the latter. We mention that we use small perturbations of the algorithmic parameters derived from our asymptotic analysis. Following the notations introduced in Theorem 1, instead of a_t and b_t, we use a_t^δ = (1 + δ)a_t and b_t^δ = (1 − δ)b_t with δ = 0.01. These conservative perturbations are necessary in practice due to the finite-sample approximations. We defer a detailed description of the experimental setup to Appendix C. Second, we verify on Figure 3 that our predicted convergence rates for Algorithms 1 and 2 are matched empirically. For this purpose, we generate a synthetic dataset with n = 8192, d ∈ {500, 1250, 2000} and varying sketching size, with a data matrix having an exponential spectral decay. Lastly, we verify our results on standard machine learning benchmarks, that is, we consider the MNIST and CIFAR10 datasets, for which results are respectively reported on Figures 4 and 5.
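The synthetic setup quoted above can be reproduced along the following lines. This is a minimal sketch, not the authors' code: the decay rate rho and the use of QR factors for the orthonormal bases are illustrative assumptions (the paper defers exact details to Appendix C).

```python
import numpy as np

def exp_decay_matrix(n=8192, d=1600, rho=0.98, seed=0):
    """n x d matrix with exponentially decaying singular values sigma_i = rho**i."""
    rng = np.random.default_rng(seed)
    # Orthonormal factors via (reduced) QR of Gaussian matrices.
    U, _ = np.linalg.qr(rng.standard_normal((n, d)))
    V, _ = np.linalg.qr(rng.standard_normal((d, d)))
    sigma = rho ** np.arange(d)  # exponential spectral decay
    return U @ np.diag(sigma) @ V.T

A = exp_decay_matrix()
b = np.random.default_rng(1).standard_normal(A.shape[0])
```

Smaller (n, d) work just as well for a quick check; the paper's sizes are used here only to match the quoted experiment.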
Researcher Affiliation | Academia | Jonathan Lacotte¹, Mert Pilanci¹ (¹Department of Electrical Engineering, Stanford University). Correspondence to: Jonathan Lacotte <lacotte@stanford.edu>.
Pseudocode | Yes | Algorithm 1: Optimal First-Order Method for Gaussian embeddings. Algorithm 2: Optimal First-Order Method for SRHT (or Haar) embeddings.
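The paper presents the algorithms only as pseudocode. Below is a schematic rendering of the kind of sketch-preconditioned Heavy-ball loop Algorithms 1 and 2 describe, with step sizes a_t and momentum b_t as in Theorem 1. The Gaussian sketch, the exact update form, and the helper name heavy_ball_sketched are illustrative assumptions; the paper's optimal (a_t, b_t) sequences are not reproduced here.

```python
import numpy as np

def heavy_ball_sketched(A, b, m, a_seq, b_seq, iters, seed=0):
    """Heavy-ball on min ||Ax - b||^2, preconditioned by a sketched Hessian."""
    n, d = A.shape
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian sketch (assumption)
    SA = S @ A
    H_inv = np.linalg.inv(SA.T @ SA)              # sketched-Hessian preconditioner
    x_prev = np.zeros(d)
    x = np.zeros(d)
    for t in range(iters):
        grad = A.T @ (A @ x - b)                  # full least-squares gradient
        # Heavy-ball update: preconditioned gradient step plus momentum term.
        x, x_prev = x - a_seq[t] * (H_inv @ grad) + b_seq[t] * (x - x_prev), x
    return x
```

For SRHT or Haar embeddings (Algorithm 2), only the construction of S changes; the iteration structure is the same.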
Open Source Code | No | The paper does not provide any explicit statements about open-source code availability or links to a code repository for the methodology described.
Open Datasets | Yes | Lastly, we verify our results on standard machine learning benchmarks, that is, we consider the MNIST and CIFAR10 datasets, for which results are respectively reported on Figures 4 and 5.
Dataset Splits | No | The paper mentions using well-known datasets but does not provide specific details on how these datasets were split into training, validation, and test sets (e.g., percentages, sample counts, or citations to standard splits).
Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, or cloud computing resources) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup | Yes | We mention that we use small perturbations of the algorithmic parameters derived from our asymptotic analysis. Following the notations introduced in Theorem 1, instead of a_t and b_t, we use a_t^δ = (1 + δ)a_t and b_t^δ = (1 − δ)b_t with δ = 0.01. These conservative perturbations are necessary in practice due to the finite-sample approximations. We defer a detailed description of the experimental setup to Appendix C.
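The quoted perturbation is straightforward to apply in code. A minimal sketch, assuming placeholder a_t and b_t sequences (the true values come from the paper's Theorem 1 and are not reproduced here):

```python
import numpy as np

# Illustrative placeholder sequences; the actual a_t, b_t are given by Theorem 1.
a_seq = np.full(50, 0.5)
b_seq = np.full(50, 0.9)

delta = 0.01
a_delta = (1 + delta) * a_seq  # a_t^delta = (1 + delta) a_t: slightly larger step
b_delta = (1 - delta) * b_seq  # b_t^delta = (1 - delta) b_t: slightly damped momentum
```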