Robust Partially-Compressed Least-Squares
Authors: Stephen Becker, Ban Kawas, Marek Petrik
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results show that both partially-compressed and robust partially-compressed solutions can outperform models that use full compression in terms of quality of solutions. We also show that compressed variants are more computationally efficient than ordinary least-squares, especially as dimensions grow. ... Results in Fig. 1, which we explain in detail in the experimental section, demonstrate the main issue with standard methods as well as how we alleviate it. ... Empirical Results Our focus in this section is on the improvement of the solution error in comparison with the non-compressed least-squares solution as well as the improvement over regular full compression. We also investigate the computational speed of the algorithms and show that partial compression is just as fast as full compression (and hence sometimes faster than standard least-squares), and that robust partial compression is only roughly twice as slow (and asymptotically it is the same cost). |
| Researcher Affiliation | Collaboration | Stephen Becker University of Colorado Boulder stephen.Becker@colorado.edu Ban Kawas IBM T.J. Watson Research Center bkawas@us.ibm.com Marek Petrik University of New Hampshire mpetrik@cs.unh.edu |
| Pseudocode | Yes | Algorithm 1: Efficient Algorithm for Solving RPC |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | We first use a data set from the National Health Interview Survey from 1992, containing 44085 rows and only 9 columns; since it is highly overcomplete and contains potentially sensitive data, it is a good candidate for sketching. |
| Dataset Splits | Yes | To test over this, we do 100 realizations of 5000 randomly chosen training data and 10000 testing data, and for each realization draw 50 random Walsh-Hadamard sketches with m = 10N. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper mentions CPLEX and Blendenpik (mex) but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For robust variants, we set μ to be 5 times the minimum eigenvalue of AᵀΦᵀΦA, and we set ρ = 1. ... The compression method used the counting sketch in all compressed methods with the exception of Blendenpik (mex), which used the Walsh-Hadamard random matrix via the Spiral WHT Package (Püschel et al. 2005), and both Blendenpik (Avron, Maymounkov, and Toledo 2010) versions are set to use low accuracy for the LSQR step. The matrix A is a 5·10⁴ × 500 random matrix with condition number 10⁶. |
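To make the setup quoted above concrete, here is a minimal NumPy sketch of the pipeline the table describes: a counting sketch Φ, fully compressed least squares, and partial compression. This is not the authors' code; the dimensions and seed are placeholders, and the partially-compressed objective is written here as min (1/2)‖ΦAx‖² − ⟨b, Ax⟩, which is our reading of "compress only the data matrix, not the target."

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic overdetermined system (dimensions are illustrative only).
n, d, m = 5000, 20, 1000
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# One draw of a counting sketch Phi (m x n): each input row is hashed
# to a single output row with a random sign, so Phi @ M costs O(nnz(M)).
rows = rng.integers(0, m, size=n)
signs = rng.choice([-1.0, 1.0], size=n)

def count_sketch(M):
    """Compute Phi @ M without materializing Phi, for 1-D or 2-D M."""
    SM = np.zeros((m,) + M.shape[1:])
    np.add.at(SM, rows, signs.reshape(-1, *([1] * (M.ndim - 1))) * M)
    return SM

PhiA, Phib = count_sketch(A), count_sketch(b)

# Ordinary least squares (no compression): min ||A x - b||.
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]

# Full compression: min ||Phi A x - Phi b||.
x_fc = np.linalg.lstsq(PhiA, Phib, rcond=None)[0]

# Partial compression: only the quadratic term is sketched,
#   min (1/2) ||Phi A x||^2 - <b, A x>,
# whose normal equations are (Phi A)^T (Phi A) x = A^T b.
G = PhiA.T @ PhiA
x_pc = np.linalg.solve(G, A.T @ b)

# The mu quoted in the table (5x the smallest eigenvalue of
# A^T Phi^T Phi A) is cheap to obtain from the d x d Gram matrix G.
mu = 5 * np.linalg.eigvalsh(G).min()
```

Note the cost asymmetry the paper's speed claims rest on: after the O(nnz(A)) sketch, both compressed variants factor only m × d or d × d systems, while the partial variant additionally needs the uncompressed d-vector Aᵀb.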