An Iterative, Sketching-based Framework for Ridge Regression
Authors: Agniva Chowdhury, Jiasen Yang, Petros Drineas
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluations verify our theoretical results on both real and synthetic data. 4. Empirical Evaluation: We perform experiments on the ARCENE dataset (Guyon et al., 2005) from the UCI repository (Lichman, 2013). The design matrix contains 200 samples with 10,000 real-valued features; we normalize the entries to be within the interval [0, 1]. The response vector consists of ±1 labels. We also perform experiments on synthetic data generated as in Chen et al. (2015); see Appendix H for details. In our experiments, we compare three different choices of sampling probabilities: selecting columns (i) uniformly at random, (ii) proportional to their leverage scores, or (iii) proportional to their ridge leverage scores. For each sampling method, we run Algorithm 1 for 50 iterations with a variety of sketch sizes, and measure (i) the relative error of the solution vector, ‖x̂ − x*‖₂/‖x*‖₂, where x* is the true optimal solution, and (ii) the objective sub-optimality, f(x̂)/f(x*) − 1, where f(x) = ‖Ax − b‖₂² + λ‖x‖₂² is the objective function for the ridge-regression problem. The results are shown in Figure 1. Figures 1a and 1b plot the relative error of the solution vector and the objective sub-optimality (for a fixed sketch size) as the iterative algorithm progresses. Figure 1c plots the relative error of the solution with respect to varying sketch sizes (the plots for objective sub-optimality are analogous and thus omitted). We observe that both the solution error and the objective sub-optimality decay exponentially as our iterative algorithm progresses. (Code sketches of the sampling scheme and these evaluation metrics appear after the table.) |
| Researcher Affiliation | Academia | Agniva Chowdhury 1 Jiasen Yang 1 Petros Drineas 21Department of Statistics, Purdue University, West Lafayette, IN 2Department of Computer Science, Purdue University, West Lafayette, IN. |
| Pseudocode | Yes | Algorithm 1: Iterative, sketching-based ridge regression. Input: A ∈ R^{n×d}, b ∈ R^n, λ > 0; number of iterations t > 0; sketching matrix S ∈ R^{d×s}. Initialize: b^(0) ← b, x̃^(0) ← 0_d, y^(0) ← 0_n. For j = 1 to t: b^(j) ← b^(j−1) − λ y^(j−1) − A x̃^(j−1); y^(j) ← (A S S^T A^T + λ I_n)^{−1} b^(j); x̃^(j) ← A^T y^(j). Output: approximate solution vector x̂ = Σ_{j=1}^{t} x̃^(j). (A runnable NumPy sketch of this procedure appears after the table.) |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | We perform experiments on the ARCENE dataset (Guyon et al., 2005) from the UCI repository (Lichman, 2013). The design matrix contains 200 samples with 10,000 real-valued features; we normalize the entries to be within the interval [0, 1]. The response vector consists of ±1 labels. We also perform experiments on synthetic data generated as in Chen et al. (2015); see Appendix H for details. |
| Dataset Splits | No | The paper states it uses the ARCENE dataset and synthetic data, and mentions 'design matrix' and 'response vector'. It discusses calculating 'relative error of the solution vector' and 'objective sub-optimality' on the data, but it does not specify how the data was split into training, validation, or test sets with percentages or sample counts. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory, or cloud instances). |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | For these experiments, we have set the regularization parameter λ = 10 in the ridge regression objective as well as when computing the ridge leverage score sampling probabilities. |
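
The pseudocode quoted above translates almost directly into a few lines of NumPy. The sketch below is an illustrative reimplementation under stated assumptions: the function name, the dense `np.linalg.solve` call, and the precomputation of the sketched Gram matrix are choices made here, not details taken from the paper (which releases no code).

```python
import numpy as np

def iterative_sketched_ridge(A, b, lam, S, t):
    """Minimal sketch of Algorithm 1 (iterative, sketching-based ridge regression).

    A   : (n, d) design matrix (the paper targets the n << d regime)
    b   : (n,)   response vector
    lam : ridge regularization parameter lambda > 0
    S   : (d, s) sketching matrix
    t   : number of iterations
    Returns x_hat = sum_{j=1}^t x_tilde^(j).
    """
    n, d = A.shape
    AS = A @ S                                 # n x s sketched design matrix
    M = AS @ AS.T + lam * np.eye(n)            # A S S^T A^T + lam * I_n  (n x n)
    b_j = b.astype(float).copy()
    x_tilde = np.zeros(d)
    y = np.zeros(n)
    x_hat = np.zeros(d)
    for _ in range(t):
        b_j = b_j - lam * y - A @ x_tilde      # b^(j) <- b^(j-1) - lam*y^(j-1) - A x_tilde^(j-1)
        y = np.linalg.solve(M, b_j)            # y^(j) <- (A S S^T A^T + lam*I_n)^{-1} b^(j)
        x_tilde = A.T @ y                      # x_tilde^(j) <- A^T y^(j)
        x_hat += x_tilde                       # accumulate the partial solutions
    return x_hat
```

Forming the n × n matrix M = A S SᵀAᵀ + λIₙ once outside the loop keeps each iteration at the cost of two matrix–vector products with A plus one n × n solve; a Cholesky factorization of M could also be computed once and reused across iterations.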
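The experiments compare uniform, leverage-score, and ridge-leverage-score column sampling, but the quoted text does not spell out how the d × s sketching matrix S is materialized. One standard sampling-and-rescaling construction is sketched below; the score formula τᵢ = aᵢᵀ(AAᵀ + λIₙ)⁻¹aᵢ and the helper name `ridge_leverage_sampling_sketch` are assumptions for illustration, not the paper's verbatim recipe.

```python
import numpy as np

def ridge_leverage_sampling_sketch(A, lam, s, seed=None):
    """One common way to build a sampling-based sketching matrix S in R^{d x s}.

    Columns of A are sampled i.i.d. with probability proportional to their ridge
    leverage scores tau_i = a_i^T (A A^T + lam*I_n)^{-1} a_i, and each selected
    column indicator is rescaled by 1 / sqrt(s * p_i).
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    K_inv = np.linalg.inv(A @ A.T + lam * np.eye(n))    # (A A^T + lam*I_n)^{-1}
    tau = np.einsum('ij,jk,ki->i', A.T, K_inv, A)       # ridge leverage scores (length d)
    p = tau / tau.sum()                                 # sampling probabilities
    idx = rng.choice(d, size=s, replace=True, p=p)      # sample s column indices
    S = np.zeros((d, s))
    S[idx, np.arange(s)] = 1.0 / np.sqrt(s * p[idx])    # rescaled selector columns
    return S
```

Replacing p with the uniform distribution recovers the uniform-sampling baseline, and setting λ = 0 in the score computation gives the ordinary leverage scores (when AAᵀ is invertible), so the same helper covers all three schemes compared in the paper.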
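The two reported metrics follow directly from their definitions. The snippet below assumes the reference solution x* is computed from the dual closed form x* = Aᵀ(AAᵀ + λIₙ)⁻¹b (efficient when n ≪ d) and uses λ = 10 as in the quoted experiment setup; the paper does not state how the authors computed x*.

```python
import numpy as np

def ridge_objective(A, b, x, lam):
    """f(x) = ||Ax - b||_2^2 + lam * ||x||_2^2, the ridge-regression objective."""
    return np.linalg.norm(A @ x - b) ** 2 + lam * np.linalg.norm(x) ** 2

def evaluation_metrics(A, b, x_hat, lam=10.0):
    """Relative solution error and objective sub-optimality of x_hat."""
    n = A.shape[0]
    # Exact ridge solution via the dual closed form x* = A^T (A A^T + lam*I_n)^{-1} b.
    x_star = A.T @ np.linalg.solve(A @ A.T + lam * np.eye(n), b)
    rel_err = np.linalg.norm(x_hat - x_star) / np.linalg.norm(x_star)
    subopt = ridge_objective(A, b, x_hat, lam) / ridge_objective(A, b, x_star, lam) - 1.0
    return rel_err, subopt
```

Chaining the three sketches, e.g. `S = ridge_leverage_sampling_sketch(A, 10.0, s)` followed by `x_hat = iterative_sketched_ridge(A, b, 10.0, S, 50)` and `evaluation_metrics(A, b, x_hat)`, mirrors the 50-iteration, λ = 10 configuration described in the table, though the exact sketch sizes s used in Figure 1 are not given in the excerpts above.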