Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging

Authors: Shusen Wang, Alex Gittens, Michael W. Mahoney

ICML 2017

Reproducibility Variable Result LLM Response
Research Type Experimental Empirically, sketched MRR solutions can have risks that are higher by an order of magnitude than those of the optimal MRR solutions. We establish theoretically and empirically that model averaging greatly decreases this gap. Our empirical evaluations bear out these theoretical results. In particular, in Section 4, we show in Figure 2 that even when the regularization parameter is fine-tuned, the risks of classical sketch and Hessian sketch are worse than that of the optimal solution by an order of magnitude.
Researcher Affiliation Academia (1) International Computer Science Institute and Department of Statistics, University of California at Berkeley, USA; (2) Department of Computer Science, Rensselaer Polytechnic Institute, USA.
Pseudocode No The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code No The paper does not provide any explicit statements or links indicating the availability of open-source code for the methodology described.
Open Datasets No We constructed $X \in \mathbb{R}^{n \times d}$ to have condition number $\kappa(X^T X) = 10^{12}$ and high row coherence, fixed $w_0 = [\mathbf{1}_{0.2d}; 0.1\,\mathbf{1}_{0.6d}; \mathbf{1}_{0.2d}]$, and set $y = X w_0 + \varepsilon \in \mathbb{R}^n$, where the entries of $\varepsilon \in \mathbb{R}^n$ were i.i.d. sampled from $\mathcal{N}(0, \xi^2)$. The details of this data model are given in the technical report version (Wang et al., 2017).
Dataset Splits No The paper describes generating synthetic data but does not specify any training/test/validation dataset splits or cross-validation setup.
Hardware Specification No The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies No The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup Yes We fix $n = 10^5$, $d = 500$, and $s = 5000$. Because the analytical expressions involve the random sketching matrix S, we randomly generate S, repeat this procedure 10 times, and report the averaged results. We set the noise intensity to be $\xi = 0.1$.
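
The data model and experiment setup quoted in the table can be illustrated with a small NumPy sketch. This is not the authors' code, and most specifics are assumptions made for illustration: the function names (`make_data`, `ridge_solve`, `sketched_ridge`, `objective`), the heavy-tailed construction used as a stand-in for "high row coherence", the Gaussian sketching matrix, the regularization value `gamma`, the number of averaged models `g`, and the reduced problem sizes. It also assumes the ridge objective (1/n)||Xw - y||^2 + gamma ||w||^2 and the usual forms of the classical sketch (sketch both X and y) and the Hessian sketch (sketch only the Gram matrix). The exact data recipe is in the technical report, not in the quoted text.

```python
import numpy as np

def make_data(n, d, xi, cond=1e12, seed=0):
    """Synthetic data loosely following the quoted description (illustrative)."""
    rng = np.random.default_rng(seed)
    # Assumption: heavy-tailed entries give non-uniform row leverage scores,
    # a stand-in for "high row coherence"; the paper's recipe may differ.
    U, _ = np.linalg.qr(rng.standard_t(df=2, size=(n, d)))
    V, _ = np.linalg.qr(rng.standard_normal((d, d)))
    # Singular values span a factor sqrt(cond), so kappa(X^T X) = cond.
    sigma = np.logspace(0.0, -0.5 * np.log10(cond), d)
    X = (U * sigma) @ V.T
    k = d // 5  # block length 0.2 d
    w0 = np.concatenate([np.ones(k), 0.1 * np.ones(d - 2 * k), np.ones(k)])
    y = X @ w0 + xi * rng.standard_normal(n)
    return X, y, w0

def ridge_solve(gram, rhs, n, gamma):
    """Solve (gram + n * gamma * I) w = rhs."""
    d = gram.shape[0]
    return np.linalg.solve(gram + n * gamma * np.eye(d), rhs)

def sketched_ridge(X, y, gamma, s, rng, kind="classical"):
    """One sketched ridge solution using a Gaussian sketch (an assumption)."""
    n = X.shape[0]
    S = rng.standard_normal((n, s)) / np.sqrt(s)  # E[S S^T] = I
    SX = S.T @ X
    if kind == "classical":
        rhs = SX.T @ (S.T @ y)   # sketch both X and y
    elif kind == "hessian":
        rhs = X.T @ y            # sketch only the Gram matrix
    else:
        raise ValueError(f"unknown sketch kind: {kind}")
    return ridge_solve(SX.T @ SX, rhs, n, gamma)

def objective(X, y, w, gamma):
    """Ridge objective (1/n) ||Xw - y||^2 + gamma ||w||^2."""
    n = X.shape[0]
    return np.sum((X @ w - y) ** 2) / n + gamma * np.sum(w ** 2)

# Reduced sizes so the example runs quickly; the quoted setup is far larger.
n, d, s, xi, gamma, g = 2000, 50, 200, 0.1, 1e-6, 10
X, y, w0 = make_data(n, d, xi)
rng = np.random.default_rng(1)

w_opt = ridge_solve(X.T @ X, X.T @ y, n, gamma)
w_single = sketched_ridge(X, y, gamma, s, rng)
# Model averaging: average g independently sketched solutions.
w_avg = np.mean([sketched_ridge(X, y, gamma, s, rng) for _ in range(g)], axis=0)

for name, w in [("optimal", w_opt), ("single sketch", w_single), ("averaged", w_avg)]:
    print(f"{name:>14s}: objective = {objective(X, y, w, gamma):.6f}")
```

Passing `kind="hessian"` to `sketched_ridge` gives the Hessian-sketch variant under the same assumptions. To mirror the quoted protocol more closely, one would repeat the random draw of S ten times and average the reported quantities.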