BetterV: Controlled Verilog Generation with Discriminative Guidance
Authors: Zehua Pei, Huiling Zhen, Mingxuan Yuan, Yu Huang, Bei Yu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we propose a Verilog generation framework, BetterV, which fine-tunes large language models (LLMs) on processed domain-specific datasets and incorporates generative discriminators for guidance on particular design demands. BetterV has the ability to generate syntactically and functionally correct Verilog, outperforming GPT-4 on the VerilogEval benchmark. With the help of task-specific generative discriminators, BetterV achieves remarkable improvements on various electronic design automation (EDA) downstream tasks, including netlist node reduction for synthesis and verification runtime reduction with Boolean Satisfiability (SAT) solving. |
| Researcher Affiliation | Collaboration | 1 The Chinese University of Hong Kong, Hong Kong SAR; 2 Noah's Ark Lab, Huawei, Hong Kong SAR. Correspondence to: Bei Yu <byu@cse.cuhk.edu.hk>. |
| Pseudocode | No | The paper includes figures (e.g., Figure 1, Figure 2, Figure 3) and mathematical equations describing the model, but no clearly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper does not explicitly state that the authors are releasing their source code for the described methodology or provide a link to a code repository. |
| Open Datasets | Yes | We employ the VerilogEval (Liu et al., 2023b), which comprises various problems that are either machine-generated or human-crafted, as our evaluation benchmark. |
| Dataset Splits | No | The paper mentions fine-tuning LLMs and training for a certain number of epochs with batch sizes, but it does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or counts) for the collected dataset. |
| Hardware Specification | Yes | The experiments are conducted on a machine with two NVIDIA Tesla V100S PCIe 32 GB graphics cards with CUDA driver 11.4. |
| Software Dependencies | Yes | The only software dependency stated with a version is the CUDA driver: "The experiments are conducted on a machine with two NVIDIA Tesla V100S PCIe 32 GB graphics cards with CUDA driver 11.4." |
| Experiment Setup | Yes | For all the models, we employ the Adam optimizer (Kingma & Ba, 2014) with β1 = 0.9 and β2 = 0.95 and the cosine learning rate decay (Loshchilov & Hutter, 2016) to schedule our learning rate. For the generative LLMs fine-tuning process, we train it for 4 epochs using an initial learning rate of 9.65e-6 with a batch size of 4. For the generative discriminator, we train it for 3 epochs using an initial learning rate of 9.65e-6 with a batch size of 8. The LoRA dimension for both LLMs and discriminator is set to 128. |
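
To make the reported setup concrete, the sketch below wires the quoted hyperparameters (Adam with β1 = 0.9, β2 = 0.95, cosine decay, initial learning rate 9.65e-6, 4 epochs, batch size 4, LoRA rank 128) into a standard fine-tuning loop. The paper does not name its training libraries, base checkpoint, LoRA target modules, or dataset, so the Hugging Face `transformers`/`peft` usage, the model name, and every value marked "placeholder" are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch of the reported setup, assuming Hugging Face transformers + PEFT.
# The base checkpoint, LoRA target modules, dataset, and step count are NOT given
# in the paper; values marked "placeholder" below are illustrative assumptions.
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

MODEL_NAME = "codellama/CodeLlama-7b-hf"    # placeholder base model
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# LoRA dimension 128 for both the generative LLM and the discriminator.
lora_cfg = LoraConfig(
    r=128,
    lora_alpha=256,                         # placeholder; alpha not reported
    target_modules=["q_proj", "v_proj"],    # placeholder target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)

# Adam with beta1 = 0.9, beta2 = 0.95 and an initial learning rate of 9.65e-6.
optimizer = Adam(model.parameters(), lr=9.65e-6, betas=(0.9, 0.95))

# Generator fine-tuning: 4 epochs, batch size 4 (discriminator: 3 epochs, batch size 8).
EPOCHS = 4
STEPS_PER_EPOCH = 1000                      # placeholder; depends on the unreleased dataset
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS * STEPS_PER_EPOCH)

def train_one_epoch(dataloader):
    """One pass of standard causal-LM fine-tuning with the schedule above."""
    model.train()
    for batch in dataloader:                # dict with input_ids / attention_mask / labels
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
```

The same optimizer and scheduler settings would carry over to the discriminator, changed only to 3 epochs and batch size 8 as quoted in the table.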