Data-Driven Offline Optimization for Architecting Hardware Accelerators
Authors: Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results show that PRIME architects hardware accelerators that improve over the best design in the training dataset, on average, by 2.46× (up to 6.7×) when specializing for a single application. From Section 6 (Experimental Evaluation): Our evaluations aim to answer the following questions: Q(1) Can PRIME design accelerators tailored for a given application that are better than the best observed configuration in the training dataset, and comparable to or better than state-of-the-art simulation-driven methods under a given simulator-query budget? Q(2) Does PRIME reduce the total simulation time compared to other methods? Q(3) Can PRIME produce hardware accelerators for a family of different applications? Q(4) Can PRIME trained for a family of applications extrapolate to designing a high-performing accelerator for a new, unseen application, thereby enabling data reuse? |
| Researcher Affiliation | Collaboration | Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine; Google Research and UC Berkeley (equal contribution); aviralk@berkeley.edu, ayazdan@google.com |
| Pseudocode | Yes | Algorithm 1 outlines our overall system for accelerator design. |
| Open Source Code | No | No explicit statement or link for open-source code for the methodology described in this paper. |
| Open Datasets | No | We used an offline dataset D of (accelerator parameters, latency) via random sampling from the space of 452M possible accelerator configurations. The paper does not provide concrete access information (link, DOI, specific repository name, or formal citation with authors/year) for this dataset. |
| Dataset Splits | Yes | For each training run, we hold out the best 20% of the points out of the training set and use them only for cross-validation as follows. (A hedged sketch of this split appears after the table.) |
| Hardware Specification | No | The paper discusses hardware accelerators as the subject of its research (e.g., Google TPUs, Nvidia GPUs) but does not specify the hardware used to run the experiments, simulations, or training of its models. |
| Software Dependencies | No | The paper mentions the use of the 'Adam [32] optimizer' but does not provide specific version numbers for any software components, libraries, or frameworks used in its implementation. |
| Experiment Setup | Yes | The hyperparameters for training the conservative surrogate in Equation 3 and its contextual version are as follows: ... Optimizer/learning rate for training fθ(x): Adam, 1e-4, default β1 = 0.9, β2 = 0.999. ... Ranges of α, β: we trained several fθ(x) models with α ∈ [0.0, 0.01, 0.1, 0.5, 1.0, 5.0] and β ∈ [0.0, 0.01, 5.0, 0.1, 1.0]. ... Negative sampling with Opt(·): ... we refresh (i.e., reinitialize) the firefly parameters after every p = 20000 gradient steps of training the conservative surrogate, and run t = 5 steps of firefly optimization per gradient step taken on the conservative surrogate. (A hedged training-loop sketch appears after the table.) |
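The Dataset Splits row above quotes a protocol that holds out the best 20% of training points and uses them only for cross-validation. Below is a minimal sketch of such a split, assuming the designs and latencies are NumPy arrays and that "best" means lowest latency; the function name and array layout are our own illustration, not the authors' released code.

```python
import numpy as np

def best_point_holdout(designs, latencies, holdout_frac=0.2):
    """Hold out the best-performing fraction of points (lowest latency here)
    and use them only for cross-validation; keep the rest for training.

    Illustrative sketch only: whether "best" means lowest latency or highest
    objective value is an assumption, not stated in the quoted text.
    """
    order = np.argsort(latencies)                     # ascending: best (lowest) latency first
    n_holdout = int(len(latencies) * holdout_frac)
    holdout_idx, train_idx = order[:n_holdout], order[n_holdout:]
    train = (designs[train_idx], latencies[train_idx])
    holdout = (designs[holdout_idx], latencies[holdout_idx])
    return train, holdout
```

Keeping the highest-performing designs out of training and scoring surrogates only on them mirrors the quoted cross-validation strategy, which favors models that rank unseen good designs correctly.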
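The Experiment Setup row quotes the surrogate-training hyperparameters (Adam with learning rate 1e-4 and default betas, firefly negative sampling refreshed every p = 20000 gradient steps, and t = 5 firefly steps per surrogate gradient step). The following is a hedged PyTorch-style sketch of how such a loop could be wired together; the two-layer surrogate, the gradient-ascent "negative" sampler, the toy data, and the simplified conservative penalty are stand-ins for the paper's fθ(x), Opt(·), and Equation 3, not the authors' implementation, and the α value is simply one point from the quoted range.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

P_REFRESH, T_NEG_STEPS, ALPHA = 20_000, 5, 0.1       # p, t from the quoted setup; alpha from its quoted range

# Toy dataset of (accelerator parameters, latency) pairs; shapes are illustrative.
X, Y = torch.randn(2048, 10), torch.randn(2048)
loader = DataLoader(TensorDataset(X, Y), batch_size=64)

# Stand-in surrogate f_theta(x); the paper's architecture is not reproduced here.
surrogate = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-4, betas=(0.9, 0.999))

def init_negatives(batch_size=32, dim=10):
    # Stand-in for (re)initializing the firefly population of candidate designs.
    return torch.randn(batch_size, dim, requires_grad=True)

negatives = init_negatives()
for step, (x, y) in enumerate(loader):
    if step % P_REFRESH == 0:
        negatives = init_negatives()                  # refresh the sampler every p gradient steps
    for _ in range(T_NEG_STEPS):                      # t sampler steps per surrogate gradient step
        score = surrogate(negatives).sum()
        grad, = torch.autograd.grad(score, negatives)
        # Plain gradient ascent stands in for firefly optimization of Opt(.).
        negatives = (negatives + 0.01 * grad).detach().requires_grad_(True)
    # Fit the data while pushing down optimistic predictions on the adversarially
    # found designs (a simplified form of the conservative objective in Eq. 3).
    mse = ((surrogate(x).squeeze(-1) - y) ** 2).mean()
    loss = mse + ALPHA * surrogate(negatives).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```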