Fast Regression for Structured Inputs

Authors: Raphael A Meyer, Cameron N Musco, Christopher P Musco, David Woodruff, Samson Zhou

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We provide empirical evidence to validate our core statistical claim about Vandermonde regression: that the relative error ε achieved by ℓp/2r Lewis Weight subsampling is polynomial in p, and not exponential in p.
Researcher Affiliation Academia Raphael A. Meyer New York University ram900@nyu.edu Cameron Musco University of Massachusetts Amherst cmusco@cs.umass.edu Christopher Musco New York University cmusco@nyu.edu David P. Woodruff Carnegie Mellon University dwoodruf@andrew.cmu.edu Samson Zhou Carnegie Mellon University samsonzhou@gmail.com
Pseudocode Yes Fig. 1: Iterative algorithm for approximate ℓp Lewis weights with p < 4. Algorithm 1 Round Trunc: Round and truncate coordinates of input vector b Algorithm 2 Faster regression for Vandermonde matrices Algorithm 3 Faster ℓp regression for general matrices
Open Source Code No The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets No To validate our theory, we plot log(εempirical) versus p and m on synthetic data. Specifically, we generate n = 25, 000 i.i.d. N(0, 1) times samples to form a Vandermonde matrix V with d = 20 columns, then compute the polynomial q(t) = t10 at each time sample, add N(0, 1010) additive noise, and save the corresponding values in b.
Dataset Splits No The paper describes generating synthetic data for experiments but does not mention specific train/validation/test dataset splits.
Hardware Specification Yes Figure 2 shows the result of these experiments, which were run in Julia 1.6.1, on Windows 10 with an Intel i7-7700K CPU and 16Gb RAM.
Software Dependencies Yes Figure 2 shows the result of these experiments, which were run in Julia 1.6.1, on Windows 10 with an Intel i7-7700K CPU and 16Gb RAM.
Experiment Setup Yes To validate our theory, we plot log(εempirical) versus p and m on synthetic data. Specifically, we generate n = 25, 000 i.i.d. N(0, 1) times samples to form a Vandermonde matrix V with d = 20 columns, then compute the polynomial q(t) = t10 at each time sample, add N(0, 1010) additive noise, and save the corresponding values in b. ... In Figure 2a, we fix m = 1000 and vary p [2, 25]. ... in Figure 2b, we fix p = 6 and vary m [100, 3000]. ... For this test, we fix n = 25, 000, d = 10, and p = 6, while varying m [1, 1000].