Breaking Locality Accelerates Block Gauss-Seidel

Authors: Stephen Tu, Shivaram Venkataraman, Ashia C. Wilson, Alex Gittens, Michael I. Jordan, Benjamin Recht

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section we experimentally validate our theoretical results on how our accelerated algorithms can improve convergence rates. Our experiments use a combination of synthetic matrices and matrices from large scale machine learning tasks."
Researcher Affiliation | Academia | "¹UC Berkeley, Berkeley, California, USA ²Rensselaer Polytechnic Institute, Troy, New York, USA. Correspondence to: Stephen Tu <stephent@berkeley.edu>."
Pseudocode | Yes | Algorithm 1: Accelerated randomized block Gauss-Seidel (a numpy sketch appears after the table).
Require: A ∈ ℝ^(n×n), A ≻ 0, b ∈ ℝ^n, sketching matrices {S_k}_{k=0}^{T−1} ⊆ ℝ^(n×p), x_0 ∈ ℝ^n, µ ∈ (0, 1), ν ≥ 1.
1: Set τ = √(µ/ν).
2: Set y_0 = z_0 = x_0.
3: for k = 0, ..., T − 1 do
4:   x_{k+1} = (1/(1+τ)) y_k + (τ/(1+τ)) z_k.
5:   H_k = S_k (S_kᵀ A S_k)† S_kᵀ.
6:   y_{k+1} = x_{k+1} − H_k (A x_{k+1} − b).
7:   z_{k+1} = z_k + τ (x_{k+1} − z_k) − (τ/µ) H_k (A x_{k+1} − b).
8: end for
9: Return y_T.
Open Source Code | No | The paper does not include a statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | "For the CIFAR-10 dataset, we augment the dataset to include five reflections, translations per-image and then apply standard pre-processing steps used in image classification (Coates & Ng, 2012; Sparks et al., 2017). We finally apply a Gaussian kernel on our pre-processed images and the resulting kernel matrix has n = 250000 coordinates. We also include experiments on a smaller MNIST kernel matrix (n = 60000) in Section A.7." (A sketch of forming such a kernel matrix appears after the table.)
Dataset Splits | No | The paper mentions using CIFAR-10 and MNIST datasets and discusses error tolerances, but does not explicitly describe training, validation, or test splits. It only states "n = 250000 coordinates" for CIFAR-10 and "n = 60000" for MNIST, and refers to "augmenting the dataset" and "pre-processing steps".
Hardware Specification | Yes | "We run all our experiments on a 4 socket Intel Xeon CPU E7-8870 machine with 18 cores per socket and 1TB of DRAM."
Software Dependencies | Yes | "We implement all our algorithms in Python using numpy, and use the Intel MKL library with 72 OpenMP threads for numerical operations." (One way to pin these thread counts is shown after the table.)
Experiment Setup | Yes | "Setup. We run all our experiments on a 4 socket Intel Xeon CPU E7-8870 machine with 18 cores per socket and 1TB of DRAM. We implement all our algorithms in Python using numpy, and use the Intel MKL library with 72 OpenMP threads for numerical operations. We report errors as relative errors, i.e. ‖x_k − x_*‖²_A / ‖x_*‖²_A. Finally, we use the best values of µ and ν found by tuning each experiment." (A sketch of this error computation appears after the table.)
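
The following is a minimal numpy sketch of Algorithm 1. It assumes each S_k is a uniformly random column selector of block size p; the sketch distribution, the function name, and the seed parameter are illustrative assumptions, not the authors' code (none is linked).

```python
import numpy as np

def accelerated_block_gauss_seidel(A, b, x0, mu, nu, T, p, seed=0):
    """Sketch of Algorithm 1, assuming S_k selects p coordinates
    uniformly at random. A is positive definite, mu in (0, 1), nu >= 1."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    tau = np.sqrt(mu / nu)               # step 1: tau = sqrt(mu/nu)
    y, z = x0.copy(), x0.copy()          # step 2: y_0 = z_0 = x_0
    for _ in range(T):
        x = (y + tau * z) / (1.0 + tau)              # step 4
        idx = rng.choice(n, size=p, replace=False)   # S_k picks p coordinates
        # Steps 5-7 apply H_k = S_k (S_k^T A S_k)^+ S_k^T to the residual
        # without ever forming the n x n matrix H_k explicitly.
        r = A @ x - b
        h = np.zeros(n)
        h[idx] = np.linalg.pinv(A[np.ix_(idx, idx)]) @ r[idx]
        y = x - h                                    # step 6
        z = z + tau * (x - z) - (tau / mu) * h       # step 7
    return y
```

With column-selector sketches, each iteration only solves a p × p subproblem, which is where the per-iteration savings of block Gauss-Seidel come from.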
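For the CIFAR-10 and MNIST experiments, the paper works with a Gaussian kernel matrix over preprocessed features. A standard way to form such a matrix in numpy is sketched below; the feature array X and bandwidth sigma are placeholders, since the paper does not specify its kernel code.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) for rows x_i of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clamp negatives from round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))
```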
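The reported setup uses Intel MKL with 72 OpenMP threads. One common way to pin the thread count, shown here as an assumption rather than the authors' configuration, is to set the environment variables before numpy is first imported:

```python
import os

# Must be set before numpy (and hence MKL) is loaded.
os.environ["OMP_NUM_THREADS"] = "72"
os.environ["MKL_NUM_THREADS"] = "72"

import numpy as np  # numpy/MKL now picks up the thread settings
```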
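Finally, the reported relative error ‖x_k − x_*‖²_A / ‖x_*‖²_A is straightforward to compute once a reference solution is available. A minimal sketch, assuming A, x_k, and x_star are numpy arrays:

```python
import numpy as np

def relative_error(A, x_k, x_star):
    """Relative error ||x_k - x_star||_A^2 / ||x_star||_A^2,
    where ||v||_A^2 = v^T A v."""
    d = x_k - x_star
    return float(d @ (A @ d)) / float(x_star @ (A @ x_star))
```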