Large Scale Constrained Linear Regression Revisited: Faster Algorithms via Preconditioning
Authors: Di Wang, Jinhui Xu
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases. |
| Researcher Affiliation | Academia | Di Wang, Department of Computer Science and Engineering, State University of New York at Buffalo; Jinhui Xu, Department of Computer Science and Engineering, State University of New York at Buffalo |
| Pseudocode | Yes | Algorithm 2 HDpwBatchSGD(A, b, x0, T, r, η, s). Input: x0 is the initial point, r is the batch size, η is the fixed step size, s is the sketch size, and T is the iteration number. 1: Compute R ∈ ℝ^(d×d) which makes AR⁻¹ an (O(d), O(1), 2)-conditioned basis of A as in Algorithm 1, using a sketch matrix S of size s × n. 2: Compute HDA and HDb, where HD is a Randomized Hadamard Transform. 3: for t = 1, …, T do 4: Randomly sample an index set τ_t of size r, where each index in τ_t is i.i.d. uniformly sampled. 5: c_τt = (2n/r) Σ_{j∈τt} (HDA)_jᵀ [(HDA)_j x_{t−1} − (HDb)_j] = (2n/r) (HDA)_τtᵀ [(HDA)_τt x_{t−1} − (HDb)_τt]. 6: x_t = argmin_{x∈W} (1/2)‖R(x_{t−1} − x)‖₂² + η⟨c_τt, x⟩ = P_W(x_{t−1} − η R⁻¹(R⁻¹)ᵀ c_τt). 7: end for 8: return x_avg = (1/T) Σ_{t=1}^T x_t. (A hedged Python sketch of this procedure appears after the table.) |
| Open Source Code | No | The paper only provides links to third-party code used, not the source code for the methodology described in this paper. |
| Open Datasets | Yes | The datasets Year and Buzz come from the UCI Machine Learning Repository (Lichman 2013). |
| Dataset Splits | No | The paper does not explicitly provide details about training, validation, or test dataset splits or cross-validation setup. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. |
| Software Dependencies | No | The paper mentions MATLAB and methods such as Count Sketch, SGD, and Adagrad, but does not provide version numbers for any software components. |
| Experiment Setup | Yes | We use the step size as described in Theorem 2 (note that we assume that the step size is already known in advance). For pwGradient, i.e., Algorithm 4, we set η = 1/2 as the step size. In each experiment, the initial value x0 is set to be the zero vector. (A usage example under these settings appears after the table.) |
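Since the paper's own code is not released (see the Open Source Code row), the following is a minimal Python sketch of what the reconstructed Algorithm 2 (HDpwBatchSGD) pseudocode appears to prescribe. The Count Sketch preconditioner, the dense Hadamard transform, the power-of-two zero-padding, and the `project` callback (standing in for the projection P_W onto the constraint set W) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import hadamard

def count_sketch(A, s, rng):
    """Apply an s-by-n Count Sketch S to A (n-by-d): each row of A is hashed
    to one of s buckets with a random sign, so SA costs O(nnz(A)) to form."""
    n = A.shape[0]
    buckets = rng.integers(0, s, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    SA = np.zeros((s, A.shape[1]))
    np.add.at(SA, buckets, signs[:, None] * A)  # unbuffered row accumulation
    return SA

def preconditioner(A, s, rng):
    """Step 1: QR-factor the sketch SA (s >= d) so that A @ inv(R) is a
    well-conditioned basis for the column space of A."""
    _, R = np.linalg.qr(count_sketch(A, s, rng))
    return R

def randomized_hadamard(A, b, rng):
    """Step 2: apply HD, with D a random diagonal sign matrix and H the
    normalized Walsh-Hadamard transform; rows are zero-padded to a power
    of two. (A dense H is used here for clarity; a fast WHT is O(n log n).)"""
    n = A.shape[0]
    m = 1 << (n - 1).bit_length()  # next power of two
    Ap = np.vstack([A, np.zeros((m - n, A.shape[1]))])
    bp = np.concatenate([b, np.zeros(m - n)])
    d = rng.choice([-1.0, 1.0], size=m)
    H = hadamard(m) / np.sqrt(m)
    return H @ (d[:, None] * Ap), H @ (d * bp)

def hdpw_batch_sgd(A, b, x0, T, r, eta, s, project=lambda x: x, seed=0):
    """Sketch of Algorithm 2 (HDpwBatchSGD): preconditioned mini-batch SGD
    for min_{x in W} ||Ax - b||_2^2, with `project` standing in for P_W."""
    rng = np.random.default_rng(seed)
    R = preconditioner(A, s, rng)
    Rinv = np.linalg.inv(R)
    HA, Hb = randomized_hadamard(A, b, rng)
    n_pad = HA.shape[0]  # the paper's 2n/r scaling, with n padded to 2^k
    x = x0.copy()
    iterates = []
    for _ in range(T):
        tau = rng.integers(0, n_pad, size=r)          # i.i.d. uniform indices
        resid = HA[tau] @ x - Hb[tau]
        c = (2.0 * n_pad / r) * (HA[tau].T @ resid)   # mini-batch gradient
        x = project(x - eta * (Rinv @ (Rinv.T @ c)))  # preconditioned step
        iterates.append(x)
    return np.mean(iterates, axis=0)  # return the average of the iterates
```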
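To connect this with the Experiment Setup row, a minimal usage example under the same reported settings (η = 1/2, x0 = 0); the data here is synthetic and purely illustrative, not the paper's Year/Buzz benchmarks:

```python
# Illustrative run mirroring the reported setup: eta = 1/2, x0 = 0.
rng = np.random.default_rng(1)
A = rng.standard_normal((1000, 20))
b = A @ rng.standard_normal(20) + 0.01 * rng.standard_normal(1000)
x_hat = hdpw_batch_sgd(A, b, x0=np.zeros(20), T=200, r=64, eta=0.5, s=200)
```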