Towards Automated RISC-V Microarchitecture Design with Reinforcement Learning
Authors: Chen Bai, Jianwang Zhai, Yuzhe Ma, Bei Yu, Martin D. F. Wong
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments using commercial electronic design automation (EDA) tools show that our method achieves an average PPA trade-off improvement of 16.03% than previous state-of-the-art approaches with 4.07× higher efficiency. The solution qualities outperform human implementations by at most 2.03× in the PPA trade-off. |
| Researcher Affiliation | Academia | Chen Bai¹, Jianwang Zhai², Yuzhe Ma³, Bei Yu¹, Martin D. F. Wong⁴. ¹The Chinese University of Hong Kong; ²Beijing University of Posts and Telecommunications; ³The Hong Kong University of Science and Technology (Guangzhou); ⁴Hong Kong Baptist University |
| Pseudocode | No | The paper describes the methodology using text and mathematical equations, but it does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is publicly available at https://github.com/baichen318/rl-explorer. |
| Open Datasets | No | We use towers, vvadd, spmv from official RISC-V tests as workloads in the DSE. |
| Dataset Splits | No | The paper mentions the use of training data for PPA model calibration but does not explicitly describe train/validation/test splits for the RL agent's DSE task. It states: 'By leveraging around 800–900 SonicBOOM microarchitecture designs, the Kendall τ for PPA modeling results can achieve higher than 0.92.' |
| Hardware Specification | Yes | All experiments are conducted on 80 Quad Intel(R) Xeon(R) CPU E7-4820 V3 cores with a 1 TB main memory. |
| Software Dependencies | Yes | Specifically, the performance, power, and area values are obtained from Synopsys VCS M-2017.03, Synopsys PrimeTime PX R-2020.09-SP1, and Cadence Genus 18.12-e012_1 with 7-nm technology (Clark et al. 2016). |
| Experiment Setup | Yes | The coefficient κ in Equation (5) is set as 1, ρ in Equation (6) is 0.5, λ in Equation (7) is 0.95 and the discount factor ζ in Equation (7) is 0.99. Adam optimizer is used, and the initial learning rate is 0.001. |
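
For readers reproducing the setup, the reported values in the table can be collected in one place, and the Kendall τ figure quoted in the Dataset Splits row can be made concrete with a small rank-correlation check. The following is a minimal sketch, not the authors' code: the class name `ReportedSetup`, its field names, and the function `kendall_tau` are hypothetical. The function implements the plain tau-a variant, which is an assumption, since the table does not say which variant the paper uses.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    kappa: float = 1.0           # coefficient κ in Equation (5)
    rho: float = 0.5             # ρ in Equation (6)
    lam: float = 0.95            # λ in Equation (7)
    discount: float = 0.99       # discount factor ζ in Equation (7)
    learning_rate: float = 1e-3  # initial learning rate for the Adam optimizer


def kendall_tau(predicted, measured):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    Assumption: the paper may use a different variant (e.g. tau-b).
    """
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(predicted)), 2):
        sign = (predicted[i] - predicted[j]) * (measured[i] - measured[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(predicted) * (len(predicted) - 1) // 2
    return (concordant - discordant) / n_pairs
```

A τ close to 1 means the PPA model ranks designs in nearly the same order as the EDA tool measurements, which is what matters when an exploration loop only needs to decide which of two candidate designs is better.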