Energy-Efficient Gaussian Processes Using Low-Precision Arithmetic
Authors: Nicolas Alder, Ralf Herbrich
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore how low-precision representations impact the results of Gaussian process regression and how data set properties, implementation approach, model performance, and energy consumption interact. Our findings show that a well-conditioned kernel matrix allows reducing the energy consumption by up to 89.01% for 98.08% of arithmetic operations with little to no impact on model performance. Our experimental setup includes software-based experiments for estimating performance and tracking arithmetic operations, as well as hardware-based experiments for benchmarking power consumption of arithmetic operations. |
| Researcher Affiliation | Academia | 1Hasso Plattner Institute, Potsdam, Germany. |
| Pseudocode | Yes | The Appendix provides a detailed discussion of both algorithms, including pseudocode and error bounds derived from numerical linear algebra. Appendix C provides Algorithm 1 (Gaussian Process Regression with the Cholesky-Banachiewicz Decomposition; Rasmussen & Williams, 2006), Algorithm 2 (Conjugate Gradient for Gaussian Process Regression; Hestenes & Stiefel, 1952; Davies, 2005; Maddox et al., 2022), and Algorithm 3 (Modified Conjugate Gradient for Gaussian Process Regression; Maddox et al., 2022). A minimal sketch of the Cholesky-based predictive equations appears after this table. |
| Open Source Code | Yes | All source code is available at https://github.com/nicolas-alder/energy-efficient-gps |
| Open Datasets | Yes | We selected six different regression datasets for experimentation. Five of them were arbitrarily chosen from the Penn Machine Learning Benchmark (Romano et al., 2021) to include diverse dataset properties. The sixth dataset, the California Housing dataset (Pedregosa et al., 2011), was specifically chosen to introduce a dataset that produces a very ill-conditioned kernel matrix. |
| Dataset Splits | Yes | From each dataset, we have randomly drawn 500 samples as a train set and 500 samples as a test set. |
| Hardware Specification | Yes | We used a Digilent Genesys2 with Xilinx Kintex-7 (XC7K325T-2FFG900C) FPGA-Core and Texas Instruments INA219 chip as power monitor. |
| Software Dependencies | No | The paper states: 'We utilized the GMPY2 library to implement Gaussian process regression in Python, which supports arbitrary-precision floating-point representations.' However, it does not provide specific version numbers for Python or the GMPY2 library, which are required for full reproducibility. |
| Experiment Setup | Yes | We performed 20 iterations to compute the predictive mean using the conjugate gradient approach. For the predictive covariance of each test point, we used 5 iterations. In the case of the ill-conditioned California Housing dataset, we used 100 iterations for the predictive mean. For each dataset, we evaluated mantissa precisions with t = 3, 4, ..., 8 in one-bit increments and 14, 24, 34, 44, 53 in 10-bit increments up to double precision. The exponent e was always kept at 11 bits. To ensure the validity of a real-world setting in our experiments, we used the BFGS algorithm, which is also employed in scikit-learn, to determine the length-scale l in (1). A sketch of such a precision sweep with GMPY2 appears after this table. |
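As a companion to the pseudocode entry above, the following is a minimal sketch of GP regression via the Cholesky factorization, in the spirit of Algorithm 1 (Rasmussen & Williams, 2006). The kernel choice, hyperparameter values, and helper names here are illustrative assumptions, not the authors' implementation; their actual code is in the linked repository.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0):
    """Squared-exponential kernel k(x, x') = exp(-||x - x'||^2 / (2 l^2))."""
    sq_dists = (np.sum(X1**2, axis=1)[:, None]
                + np.sum(X2**2, axis=1)[None, :]
                - 2.0 * X1 @ X2.T)
    return np.exp(-0.5 * sq_dists / length_scale**2)

def gp_predict_cholesky(X_train, y_train, X_test, length_scale=1.0, noise=1e-2):
    """Predictive mean and variance via a Cholesky factorization of the
    kernel matrix (the textbook scheme the paper's Algorithm 1 follows)."""
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    L = np.linalg.cholesky(K)                                   # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))   # alpha = K^{-1} y
    K_s = rbf_kernel(X_train, X_test, length_scale)
    mean = K_s.T @ alpha                                        # predictive mean
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(X_test, X_test, length_scale).diagonal() - np.sum(v**2, axis=0)
    return mean, var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(50, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
    X_test = np.linspace(-3, 3, 5)[:, None]
    mu, var = gp_predict_cholesky(X, y, X_test)
    print(mu, var)
```

The Cholesky route factors the kernel matrix once and reuses the factor for both the mean and the variance, which is why the paper tracks the conditioning of the kernel matrix: a well-conditioned K tolerates far fewer mantissa bits before the factorization degrades.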
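The software-dependency and experiment-setup entries describe an arbitrary-precision sweep implemented with GMPY2. Below is a minimal sketch of how such a sweep could be set up: `gmpy2.ieee(64)` is assumed as the starting context (it retains the 11-bit exponent of binary64, matching the paper's fixed exponent width), and `low_precision_dot` is a hypothetical helper for illustration, not the authors' harness.

```python
import gmpy2
from gmpy2 import mpfr

def low_precision_dot(xs, ys, mantissa_bits):
    """Dot product with every multiply and add rounded to the given mantissa
    width. Sketch only: the paper's full operation-tracking harness lives in
    the linked repository."""
    ctx = gmpy2.ieee(64)           # binary64-like context: 11-bit exponent
    ctx.precision = mantissa_bits  # MPFR counts the full significand (53 = double)
    gmpy2.set_context(ctx)
    acc = mpfr(0)
    for x, y in zip(xs, ys):
        acc += mpfr(x) * mpfr(y)   # each operation rounds to the working precision
    return acc

# Sweep mantissa widths as in the paper's setup: t = 3..8, then 14, 24, 34, 44, 53.
for t in [3, 4, 5, 6, 7, 8, 14, 24, 34, 44, 53]:
    print(t, low_precision_dot([0.1, 0.2, 0.3], [1.0, 2.0, 3.0], t))
```

Printing the same dot product at each width makes the rounding behavior directly visible: at t = 53 the result matches double precision, while at t = 3 or 4 the accumulated rounding error dominates, mirroring the accuracy-versus-precision trade-off the paper measures.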