Robust Partially-Compressed Least-Squares

Authors: Stephen Becker, Ban Kawas, Marek Petrik

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical results show that both partially-compressed and robust partially-compressed solutions can outperform models that use full compression in terms of quality of solutions. We also show that compressed variants are more computationally efficient than ordinary least-squares, especially as dimensions grow. ... Results in Fig. 1, which we explain in detail in the experimental section, demonstrate the main issue with standard methods as well as how we alleviate it. ... Empirical Results: Our focus in this section is on the improvement of the solution error in comparison with the non-compressed least-squares solution as well as the improvement over regular full compression. We also investigate the computational speed of the algorithms and show that partial compression is just as fast as full compression (and hence sometimes faster than standard least-squares), and that robust partial compression is only roughly twice as slow (and asymptotically it is the same cost).
Researcher Affiliation | Collaboration | Stephen Becker, University of Colorado Boulder, stephen.becker@colorado.edu; Ban Kawas, IBM T.J. Watson Research Center, bkawas@us.ibm.com; Marek Petrik, University of New Hampshire, mpetrik@cs.unh.edu
Pseudocode | Yes | Algorithm 1: Efficient Algorithm for Solving RPC. (See the solver sketch after the table.)
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | We first use a data set from the National Health Interview Survey from 1992, containing 44085 rows and only 9 columns; since it is highly overcomplete and contains potentially sensitive data, it is a good candidate for sketching.
Dataset Splits | Yes | To test over this, we do 100 realizations of 5000 randomly chosen training data and 10000 testing data, and for each realization draw 50 random Walsh-Hadamard sketches with m = 10N. (See the protocol sketch after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory).
Software Dependencies | No | The paper mentions CPLEX and Blendenpik (mex) but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | For robust variants, we set μ to be 5 times the minimum eigenvalue of A^T Φ^T Φ A and ρ = 1. ... The compression method used the counting sketch in all compressed methods, with the exception of Blendenpik (mex), which used the Walsh-Hadamard random matrix via the Spiral WHT Package (Püschel et al. 2005); both Blendenpik (Avron, Maymounkov, and Toledo 2010) versions are set to use low accuracy for the LSQR step. The matrix A is a 5·10^4 × 500 random matrix with condition number 10^6. (See the setup sketch after the table.)
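
For reference alongside the Pseudocode row, here is a minimal solver sketch of a partially-compressed least-squares fit, not a reproduction of the paper's Algorithm 1 for RPC: the Gram matrix is built from the sketched factor ΦA while the right-hand side A^T b stays uncompressed, and a simple ridge term μ stands in for the robust correction. The CountSketch construction, the function names, and the use of μ as a plain Tikhonov term are illustrative assumptions.

```python
import numpy as np

def countsketch(A, m, rng):
    """Apply a CountSketch-style map with m rows to A: each row of A is
    hashed to one of m buckets with a random sign (illustrative only)."""
    n = A.shape[0]
    buckets = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    PhiA = np.zeros((m, A.shape[1]))
    np.add.at(PhiA, buckets, signs[:, None] * A)
    return PhiA

def partially_compressed_ls(A, b, m, mu=0.0, rng=None):
    """Partial compression as sketched here: the Gram matrix uses the
    compressed factor Phi @ A, the right-hand side A.T @ b stays exact,
    and mu adds a plain ridge term (a stand-in for the robust correction,
    not the paper's Algorithm 1)."""
    rng = np.random.default_rng() if rng is None else rng
    PhiA = countsketch(A, m, rng)
    G = PhiA.T @ PhiA + mu * np.eye(A.shape[1])
    return np.linalg.solve(G, A.T @ b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((44085, 9))          # tall and thin, like the NHIS matrix
    b = A @ rng.standard_normal(9) + 0.1 * rng.standard_normal(44085)
    x_pc = partially_compressed_ls(A, b, m=10 * A.shape[1], rng=rng)
    x_ls = np.linalg.lstsq(A, b, rcond=None)[0]  # uncompressed reference solution
    print(np.linalg.norm(x_pc - x_ls))
```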
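
The split-and-sketch protocol quoted in the Dataset Splits row can be outlined as follows. This is only scaffolding under stated assumptions: a Gaussian sketch stands in for the Walsh-Hadamard sketch, N is taken to be the number of columns, and a plain fully-compressed (sketch-and-solve) fit is used as a placeholder for the estimators compared in the paper.

```python
import numpy as np

def evaluation_protocol(X, y, rng):
    """100 realizations of a 5000/10000 train/test split; for each split,
    draw 50 random sketches of size m = 10*N and record the test error of
    a fully-compressed fit (placeholder estimator).  Assumes X has at
    least 15000 rows, as the NHIS matrix does."""
    n, N = X.shape
    m = 10 * N
    errors = []
    for _ in range(100):                      # 100 realizations
        perm = rng.permutation(n)
        train, test = perm[:5000], perm[5000:15000]
        A, b = X[train], y[train]
        for _ in range(50):                   # 50 random sketches per split
            Phi = rng.standard_normal((m, 5000)) / np.sqrt(m)   # Gaussian stand-in
            x = np.linalg.lstsq(Phi @ A, Phi @ b, rcond=None)[0]
            errors.append(np.mean((X[test] @ x - y[test]) ** 2))
    return np.mean(errors)
```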
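
The synthetic setup quoted in the Experiment Setup row (a 5·10^4 × 500 matrix with condition number 10^6, a counting sketch, μ set to 5 times the minimum eigenvalue of A^T Φ^T Φ A, and ρ = 1) could be reproduced roughly as follows. The particular construction of the ill-conditioned matrix and the sketch size m are assumptions, and the robust solve itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ill-conditioned design matrix: 5*10^4 x 500 with condition number 10^6,
# built from random orthogonal factors and a geometric spectrum (the paper
# states only the size and the condition number).
n, d, cond = 50_000, 500, 1e6
U, _ = np.linalg.qr(rng.standard_normal((n, d)))
V, _ = np.linalg.qr(rng.standard_normal((d, d)))
s = cond ** (-np.linspace(0.0, 1.0, d))       # singular values from 1 down to 1/cond
A = (U * s) @ V.T
b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

# Counting sketch Phi applied to A (sketch size m is a free choice here).
m = 10 * d
buckets = rng.integers(0, m, size=n)
signs = rng.choice([-1.0, 1.0], size=n)
PhiA = np.zeros((m, d))
np.add.at(PhiA, buckets, signs[:, None] * A)

# Parameter choices quoted above: mu = 5 * lambda_min(A^T Phi^T Phi A), rho = 1.
G = PhiA.T @ PhiA
mu = 5.0 * np.linalg.eigvalsh(G)[0]           # eigvalsh returns ascending eigenvalues
rho = 1.0
print(f"mu = {mu:.3e}, rho = {rho}")
```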