Energy-Efficient Gaussian Processes Using Low-Precision Arithmetic

Authors: Nicolas Alder, Ralf Herbrich

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We explore how low-precision representations impact the results of Gaussian process regression and how data set properties, implementation approach, model performance, and energy consumption interact. Our findings show that a well-conditioned kernel matrix allows reducing the energy consumption by up to 89.01% for 98.08% of arithmetic operations with little to no impact on model performance. Our experimental setup includes software-based experiments for estimating performance and tracking arithmetic operations, as well as hardware-based experiments for benchmarking power consumption of arithmetic operations.
Researcher Affiliation | Academia | Hasso Plattner Institute, Potsdam, Germany.
Pseudocode | Yes | The Appendix provides a detailed discussion of both algorithms, including pseudocode and error bounds derived from numerical linear algebra. Algorithm 1 (Gaussian Process Regression with the Cholesky-Banachiewicz Decomposition; Rasmussen & Williams, 2006), Algorithm 2 (Conjugate Gradient for Gaussian Process Regression; Hestenes & Stiefel, 1952; Davies, 2005; Maddox et al., 2022), and Algorithm 3 (Modified Conjugate Gradient for Gaussian Process Regression; Maddox et al., 2022) are provided in Appendix C.
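For orientation, here is a minimal NumPy/SciPy sketch of the textbook Cholesky-based GP predictive equations (Rasmussen & Williams, 2006) that Algorithm 1 builds on; the RBF kernel, noise level, and function names are illustrative assumptions, not the paper's Appendix C pseudocode.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel k(a, b) = exp(-||a - b||^2 / (2 l^2))."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / length_scale**2)

def gp_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-2):
    # K + sigma_n^2 I must be well conditioned for low precision to suffice,
    # which is the property the paper exploits.
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    L = cho_factor(K, lower=True)          # Cholesky factorization step
    alpha = cho_solve(L, y_train)          # solves (K + sigma^2 I) alpha = y
    K_star = rbf_kernel(X_test, X_train, length_scale)
    mean = K_star @ alpha                  # predictive mean
    v = cho_solve(L, K_star.T)
    cov = rbf_kernel(X_test, X_test, length_scale) - K_star @ v  # predictive cov
    return mean, cov
```

The conjugate-gradient variants (Algorithms 2 and 3) replace the explicit factorization with iterative solves of the same linear system.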
Open Source Code | Yes | All source code is available at https://github.com/nicolas-alder/energy-efficient-gps
Open Datasets | Yes | We selected six different regression datasets for experimentation. Five of them were arbitrarily chosen from the Penn Machine Learning Benchmark (Romano et al., 2021) to include diverse dataset properties. The sixth dataset, the California Housing dataset (Pedregosa et al., 2011), was specifically chosen to introduce a dataset that produces a very ill-conditioned kernel matrix.
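Assuming the pmlb and scikit-learn packages, loading could look like the sketch below; the PMLB dataset name shown is an arbitrary stand-in, since this summary does not list which five datasets were drawn.

```python
from pmlb import fetch_data                          # pip install pmlb
from sklearn.datasets import fetch_california_housing

# "529_pollen" is an arbitrary PMLB regression dataset used as a stand-in;
# the paper's five PMLB choices are not listed in this summary.
X_pmlb, y_pmlb = fetch_data("529_pollen", return_X_y=True)

# The sixth dataset, chosen for its very ill-conditioned kernel matrix.
housing = fetch_california_housing()
X_house, y_house = housing.data, housing.target
```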
Dataset Splits | Yes | From each dataset, we randomly drew 500 samples as a training set and 500 samples as a test set.
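Continuing the loading sketch above, the 500/500 split could be drawn with scikit-learn; the random seed is an assumption.

```python
from sklearn.model_selection import train_test_split

# Disjoint random subsets of 500 training and 500 test samples
# (the seed is arbitrary; the paper's seed is not stated here).
X_train, X_test, y_train, y_test = train_test_split(
    X_house, y_house, train_size=500, test_size=500, random_state=0
)
```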
Hardware Specification | Yes | We used a Digilent Genesys2 board with a Xilinx Kintex-7 (XC7K325T-2FFG900C) FPGA core and a Texas Instruments INA219 chip as the power monitor.
Software Dependencies | No | The paper states: 'We utilized the GMPY2 library to implement Gaussian process regression in Python, which supports arbitrary-precision floating-point representations.' However, it does not provide specific version numbers for Python or the GMPY2 library, which are required for full reproducibility.
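For reference, a minimal sketch of how GMPY2 exposes the significand width (assuming a recent gmpy2 release, since no version is pinned):

```python
import gmpy2
from gmpy2 import mpfr

# MPFR precision counts significand bits; mapping this onto the paper's
# mantissa parameter t is an assumption of this sketch.
gmpy2.get_context().precision = 8

a, b = mpfr("0.1"), mpfr("0.2")
print(a + b)   # computed and rounded with an 8-bit significand
```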
Experiment Setup | Yes | We performed 20 iterations to compute the predictive mean using the conjugate gradient approach. For the predictive covariance of each test point, we used 5 iterations. In the case of the ill-conditioned California Housing dataset, we used 100 iterations for the predictive mean. For each dataset, we evaluated mantissa precisions of t = 3, 4, ..., 8 in one-bit increments and t = 14, 24, 34, 44, 53 in ten-bit increments up to double precision. The exponent e was always kept at 11 bits. To ensure the validity of a real-world setting in our experiments, we used the BFGS algorithm, which is also employed in scikit-learn, to determine the length-scale l in (1).
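The precision sweep could be scripted along these lines; `rmse_at_precision` is a hypothetical helper standing in for the full GP pipeline, and the loop itself is an illustration rather than the authors' code.

```python
import gmpy2

# t = 3..8 in one-bit steps, then 14..53 in ten-bit steps (53 bits = double).
mantissa_widths = list(range(3, 9)) + [14, 24, 34, 44, 53]

for t in mantissa_widths:
    gmpy2.get_context().precision = t   # fix the significand width globally
    # rmse = rmse_at_precision(X_train, y_train, X_test, y_test)  # hypothetical
    # print(t, rmse)
```

The length-scale would be fit once per dataset before the sweep, e.g. with a BFGS optimizer over the marginal likelihood, matching the scikit-learn convention the paper cites.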