Householder Sketch for Accurate and Accelerated Least-Mean-Squares Solvers
Authors: Jyotikrishna Dass, Rabi Mahapatra
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform thorough empirical analysis with large synthetic and real datasets to evaluate the performance of Householder sketch and compare with (Maalouf et al., 2019). |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA. Correspondence to: Jyotikrishna Dass <dass.jyotikrishna@tamu.edu>. |
| Pseudocode | Yes | Algorithm 1 HOUSEHOLDER-SKETCH(X, y); see Theorem 2.2 |
| Open Source Code | Yes | We have open-sourced our codes here. |
| Open Datasets | Yes | We used the following datasets for evaluation and fair comparison of LMS-QR performance with the default LMS solvers (with cross validation) and with the Fast Caratheodory coreset-based LMS-BOOST (Maalouf et al., 2019). (i) Synthetic data (X, y) comprising uniform random entries in [0, 100) for sequential experiments. (ii) 3D Road network (Kaul et al., 2013) dataset with n = 434,874 data samples. (iii) Individual household electric power consumption (Hebrail & Berard, 2012) dataset with n = 2,075,259 data samples. |
| Dataset Splits | Yes | Cross validation folds: |m| = 3 for synthetic datasets (a)-(j) and |m| = 2 for real datasets (k)-(l) |
| Hardware Specification | Yes | We used Google Colab to run our experiments with the above LMS-QR algorithms via Python3 Google Compute Engine running on Intel Xeon CPU @ 2.20GHz and 25 GB RAM. ... For distributed experiments, we used the Anaconda Python distribution and MPI for Python (mpi4py) package on the Texas A&M University HPRC Ada computing cluster of Intel Xeon CPU @ 2.5GHz. |
| Software Dependencies | No | To implement Algorithm 1, we use the LAPACK.dgeqrf() and LAPACK.dormqr() subroutines for HOUSEHOLDER-QR and MULTIPLY-QC, respectively. ... We used Google Colab to run our experiments with the above LMS-QR algorithms via Python3 ... For distributed experiments, we used the Anaconda Python distribution and MPI for Python (mpi4py) package... Linear algebra was handled by LAPACK/BLAS, through the Intel Math Kernel Library. |
| Experiment Setup | Yes | For various sizes of the hyper-parameter set for cross validation, |A| = {50, 100, 200, 300}, Figure 1 (g)-(i) depicts LMS-QR to be consistently faster than LMS-CV and LMS-BOOST. We observe a similar trend for the real datasets in Figure 1 (k)-(l). ... Each test was performed 20 times, and the best result was chosen. |
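The Pseudocode and Software Dependencies rows above quote Algorithm 1, HOUSEHOLDER-SKETCH(X, y), and the LAPACK subroutines used to implement it (dgeqrf for HOUSEHOLDER-QR, dormqr for MULTIPLY-QC). Below is a minimal Python sketch of that pipeline through SciPy's LAPACK wrappers; it is not the authors' released code, and the dtype conversions, workspace size, and the small self-test are our own choices.

```python
import numpy as np
from scipy.linalg import lapack, solve_triangular

def householder_sketch(X, y):
    """Sketch (X, y) -> (R, b), where R is the d x d upper-triangular factor
    of X = QR and b = (Q^T y)[:d], so that (for full column rank X)
    argmin_w ||Xw - y||_2 equals R^{-1} b.
    """
    X = np.asfortranarray(X, dtype=np.float64)
    c = np.asfortranarray(y.reshape(-1, 1), dtype=np.float64)
    n, d = X.shape

    # Householder QR (dgeqrf): the result holds R in its upper triangle and
    # the Householder reflectors below the diagonal; tau are the scalar factors.
    qr, tau, _, info = lapack.dgeqrf(X)
    assert info == 0

    # Apply Q^T to y without forming Q explicitly (dormqr).  lwork = 1 is the
    # minimum legal workspace for a single right-hand side; a workspace query
    # (lwork = -1) would return the tuned block size.
    qty, _, info = lapack.dormqr('L', 'T', qr, tau, c, 1)
    assert info == 0

    R = np.triu(qr[:d, :d])
    b = qty[:d, 0]
    return R, b

if __name__ == "__main__":
    # Small self-test on synthetic data (sizes are illustrative only).
    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 100.0, size=(10_000, 8))
    y = rng.uniform(0.0, 100.0, size=10_000)
    R, b = householder_sketch(X, y)
    w = solve_triangular(R, b)
    assert np.allclose(w, np.linalg.lstsq(X, y, rcond=None)[0])
```

Because ||Xw - y||^2 = ||Rw - b||^2 + const for every w, a least-mean-squares solver run on the d x d pair (R, b) returns the same solution as on the full n x d data.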
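Building on the sketch above, the following illustrates the acceleration reported in the Experiment Setup row: sketch once, then run a hyper-parameter sweep on the d x d pair instead of the full data. The synthetic data follows the description in the Open Datasets row (uniform entries in [0, 100)); the sizes, the alpha grid, and the use of scikit-learn's Ridge as a stand-in "default" solver are our assumptions, and the paper's actual protocol uses cross-validation with |A| ∈ {50, 100, 200, 300} and also benchmarks LMS-BOOST.

```python
import time
import numpy as np
from sklearn.linear_model import Ridge

# Illustrative sizes (not the paper's); entries are uniform in [0, 100).
rng = np.random.default_rng(0)
n, d = 200_000, 15
X = rng.uniform(0.0, 100.0, size=(n, d))
y = rng.uniform(0.0, 100.0, size=n)

alphas = np.logspace(-3, 3, 200)  # stand-in hyper-parameter set

# Baseline: fit the off-the-shelf solver on the full n x d data per alpha.
t0 = time.perf_counter()
w_full = [Ridge(alpha=a, fit_intercept=False).fit(X, y).coef_ for a in alphas]
t_full = time.perf_counter() - t0

# LMS-QR style: sketch once, then run the identical sweep on the d x d pair.
# Because R^T R = X^T X and R^T b = X^T y, every ridge solution is unchanged.
t0 = time.perf_counter()
R, b = householder_sketch(X, y)   # defined in the sketch above
w_skt = [Ridge(alpha=a, fit_intercept=False).fit(R, b).coef_ for a in alphas]
t_sketch = time.perf_counter() - t0

# Solutions agree to round-off; the sketched sweep touches d x d data only.
print(np.max(np.abs(np.array(w_full) - np.array(w_skt))))
print(f"full sweep: {t_full:.2f}s   sketch + sweep: {t_sketch:.2f}s")
```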
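The Hardware Specification row mentions distributed experiments with mpi4py on the Texas A&M HPRC Ada cluster, but the excerpts above do not describe the communication pattern. The sketch below is therefore only a generic TSQR-style reduction (each rank factors its local block, the small triangular factors are gathered and merged at the root); it should not be read as the paper's distributed algorithm.

```python
# Run with, e.g.: mpirun -n 4 python distributed_sketch.py
import numpy as np
from mpi4py import MPI
from numpy.linalg import qr

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank holds a horizontal slice (X_i, y_i) of the data; faked here with
# the same uniform-[0, 100) distribution as the synthetic experiments.
rng = np.random.default_rng(rank)
n_local, d = 50_000, 10
X_i = rng.uniform(0.0, 100.0, size=(n_local, d))
y_i = rng.uniform(0.0, 100.0, size=n_local)

# Local Householder QR of the augmented block [X_i | y_i]; the (d+1) x (d+1)
# triangular factor preserves the local Gram matrix, i.e. X_i^T X_i, X_i^T y_i
# and ||y_i||^2.
R_i = qr(np.column_stack([X_i, y_i]), mode='r')

# Gather the small triangular factors and merge them with one more QR.
stacked = comm.gather(R_i, root=0)
if rank == 0:
    R_all = qr(np.vstack(stacked), mode='r')
    R, b = R_all[:d, :d], R_all[:d, d]
    w = np.linalg.solve(R, b)  # global least-squares solution over all ranks
```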