An Iterative, Sketching-based Framework for Ridge Regression

Authors: Agniva Chowdhury, Jiasen Yang, Petros Drineas

ICML 2018

Reproducibility Variable Result LLM Response
Research Type Experimental Our empirical evaluations verify our theoretical results on both real and synthetic data. 4. Empirical Evaluation: We perform experiments on the ARCENE dataset (Guyon et al., 2005) from the UCI repository (Lichman, 2013). The design matrix contains 200 samples with 10,000 real-valued features; we normalize the entries to be within the interval [0, 1]. The response vector consists of ±1 labels. We also perform experiments on synthetic data generated as in Chen et al. (2015); see Appendix H for details. In our experiments, we compare three different choices of sampling probabilities: selecting columns (i) uniformly at random, (ii) proportional to their leverage scores, or (iii) proportional to their ridge leverage scores. For each sampling method, we run Algorithm 1 for 50 iterations with a variety of sketch sizes, and measure (i) the relative error of the solution vector, ‖x̂ − x*‖₂ / ‖x*‖₂, where x* is the true optimal solution, and (ii) the objective sub-optimality, f(x̂)/f(x*) − 1, where f(x) = ‖Ax − b‖₂² + λ‖x‖₂² is the objective function for the ridge-regression problem. The results are shown in Figure 1. Figures 1a and 1b plot the relative error of the solution vector and the objective sub-optimality (for a fixed sketch size) as the iterative algorithm progresses. Figure 1c plots the relative error of the solution with respect to varying sketch sizes (the plots for objective sub-optimality are analogous and thus omitted). We observe that both the solution error and the objective sub-optimality decay exponentially as our iterative algorithm progresses.
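The two evaluation metrics quoted above follow directly from the stated objective f(x) = ‖Ax − b‖₂² + λ‖x‖₂². A minimal NumPy sketch of how they would be computed (function names are our own, not from the paper):

```python
import numpy as np

def ridge_objective(A, b, lam, x):
    # f(x) = ||Ax - b||_2^2 + lam * ||x||_2^2
    r = A @ x - b
    return r @ r + lam * (x @ x)

def relative_error(x_hat, x_star):
    # ||x_hat - x_star||_2 / ||x_star||_2
    return np.linalg.norm(x_hat - x_star) / np.linalg.norm(x_star)

def objective_suboptimality(A, b, lam, x_hat, x_star):
    # f(x_hat) / f(x_star) - 1, which is zero iff x_hat attains the optimum
    return ridge_objective(A, b, lam, x_hat) / ridge_objective(A, b, lam, x_star) - 1.0
```

Here x_star would be the exact ridge solution (A^T A + λI)^{-1} A^T b, computed once for reference.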
Researcher Affiliation Academia Agniva Chowdhury¹, Jiasen Yang¹, Petros Drineas². ¹Department of Statistics, Purdue University, West Lafayette, IN; ²Department of Computer Science, Purdue University, West Lafayette, IN.
Pseudocode Yes Algorithm 1 Iterative, sketching-based ridge regression
Input: A ∈ ℝ^(n×d), b ∈ ℝ^n, λ > 0; number of iterations t > 0; sketching matrix S ∈ ℝ^(d×s)
Initialize: b^(0) ← b, x̃^(0) ← 0_d, y^(0) ← 0_n
for j = 1 to t do
  b^(j) ← b^(j−1) − λ y^(j−1) − A x̃^(j−1)
  y^(j) ← (A S Sᵀ Aᵀ + λ I_n)^(−1) b^(j)
  x̃^(j) ← Aᵀ y^(j)
end for
Output: Approximate solution vector x̂ = Σ_{j=1}^{t} x̃^(j)
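Algorithm 1 above translates almost line-for-line into NumPy. A sketch under the stated assumptions (the sketched n×n system is pre-factored once since S is fixed across iterations; this caching is an implementation choice, not something the pseudocode prescribes):

```python
import numpy as np

def iterative_sketched_ridge(A, b, lam, t, S):
    """Sketch of Algorithm 1: iterative, sketching-based ridge regression.

    A: n x d design matrix, b: length-n response, lam: ridge parameter
    lambda > 0, t: number of iterations, S: d x s sketching matrix.
    """
    n, d = A.shape
    AS = A @ S                        # n x s sketched matrix
    M = AS @ AS.T + lam * np.eye(n)   # A S S^T A^T + lam I_n (fixed across iterations)
    b_j = b.astype(float).copy()      # b^(0)
    x_tilde = np.zeros(d)             # x~^(0)
    y = np.zeros(n)                   # y^(0)
    x_hat = np.zeros(d)               # running sum of the x~^(j)
    for _ in range(t):
        b_j = b_j - lam * y - A @ x_tilde   # b^(j) <- b^(j-1) - lam y^(j-1) - A x~^(j-1)
        y = np.linalg.solve(M, b_j)         # y^(j) <- (A S S^T A^T + lam I)^(-1) b^(j)
        x_tilde = A.T @ y                   # x~^(j) <- A^T y^(j)
        x_hat += x_tilde                    # accumulate partial solutions
    return x_hat
```

As a sanity check, taking S to be the d×d identity makes S Sᵀ = I, so the first iteration already returns the exact ridge solution Aᵀ(AAᵀ + λI)^(−1)b and subsequent corrections vanish.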
Open Source Code No The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available.
Open Datasets Yes We perform experiments on the ARCENE dataset (Guyon et al., 2005) from the UCI repository (Lichman, 2013). The design matrix contains 200 samples with 10,000 real-valued features; we normalize the entries to be within the interval [0, 1]. The response vector consists of ±1 labels. We also perform experiments on synthetic data generated as in Chen et al. (2015); see Appendix H for details.
Dataset Splits No The paper states it uses the ARCENE dataset and synthetic data, and mentions 'design matrix' and 'response vector'. It discusses calculating 'relative error of the solution vector' and 'objective sub-optimality' on the data, but it does not specify how the data was split into training, validation, or test sets with percentages or sample counts.
Hardware Specification No The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory, or cloud instances).
Software Dependencies No The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup Yes For these experiments, we have set the regularization parameter λ = 10 in the ridge regression objective as well as when computing the ridge leverage score sampling probabilities.
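The quoted setup uses λ = 10 both in the objective and when computing the ridge leverage score sampling probabilities. A minimal sketch of column ridge leverage scores, assuming the standard definition τ_j = a_jᵀ(AAᵀ + λI)^(−1)a_j for the j-th column a_j (the paper's exact formula may differ; the function name is our own):

```python
import numpy as np

def ridge_leverage_probs(A, lam):
    """Column ridge leverage scores normalized into sampling probabilities.

    tau_j = a_j^T (A A^T + lam I_n)^{-1} a_j for each column a_j of the
    n x d matrix A; p_j = tau_j / sum_k tau_k.
    """
    n, d = A.shape
    K_inv = np.linalg.inv(A @ A.T + lam * np.eye(n))
    # tau_j = a_j^T K_inv a_j for every column j, in one einsum
    tau = np.einsum('ij,ik,kj->j', A, K_inv, A)
    return tau / tau.sum()
```

With λ = 0 these reduce to the ordinary (statistical) leverage scores, which is why the report distinguishes sampling options (ii) and (iii).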