BetterV: Controlled Verilog Generation with Discriminative Guidance

Authors: Zehua Pei, Huiling Zhen, Mingxuan Yuan, Yu Huang, Bei Yu

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we propose a Verilog generation framework, BetterV, which fine-tunes large language models (LLMs) on processed domain-specific datasets and incorporates generative discriminators for guidance on particular design demands. BetterV has the ability to generate syntactically and functionally correct Verilog, outperforming GPT-4 on the VerilogEval benchmark. With the help of task-specific generative discriminators, BetterV achieves remarkable improvements on various electronic design automation (EDA) downstream tasks, including netlist node reduction for synthesis and verification runtime reduction with Boolean Satisfiability (SAT) solving. (A hedged sketch of this kind of discriminator-guided decoding appears after the table.)
Researcher Affiliation | Collaboration | 1The Chinese University of Hong Kong, Hong Kong SAR; 2Noah's Ark Lab, Huawei, Hong Kong SAR. Correspondence to: Bei Yu <byu@cse.cuhk.edu.hk>.
Pseudocode | No | The paper includes figures (e.g., Figure 1, Figure 2, Figure 3) and mathematical equations describing the model, but no clearly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code | No | The paper does not explicitly state that the authors are releasing their source code for the described methodology or provide a link to a code repository.
Open Datasets | Yes | We employ the VerilogEval (Liu et al., 2023b), which comprises various problems with either machine-generated or human-crafted descriptions, as our evaluation benchmark.
Dataset Splits | No | The paper mentions fine-tuning LLMs and training for a certain number of epochs with batch sizes, but it does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or counts) for the collected dataset.
Hardware Specification | Yes | The experiments are conducted on a machine with two NVIDIA Tesla V100S PCIe 32 GB graphics cards with CUDA driver 11.4.
Software Dependencies | Yes | The experiments are conducted on a machine with two NVIDIA Tesla V100S PCIe 32 GB graphics cards with CUDA driver 11.4.
Experiment Setup | Yes | For all the models, we employ the Adam optimizer (Kingma & Ba, 2014) with β1 = 0.9 and β2 = 0.95 and the cosine learning rate decay (Loshchilov & Hutter, 2016) to schedule our learning rate. For the generative LLMs fine-tuning process, we train it for 4 epochs using an initial learning rate of 9.65e-6 with a batch size of 4. For the generative discriminator, we train it for 3 epochs using an initial learning rate of 9.65e-6 with a batch size of 8. The LoRA dimension for both LLMs and discriminator is set as 128.
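The Experiment Setup row quotes concrete hyperparameters. Below is a minimal reproduction sketch assuming a Hugging Face transformers + peft stack; the base checkpoint name, lora_alpha, dropout, per-device batching, and the training dataset are placeholders not specified in the quote, while the epochs, learning rate, Adam betas, cosine schedule, and LoRA rank follow the quoted setup.

```python
# Hedged reproduction sketch of the quoted fine-tuning setup.
# Assumptions (not stated in the quote): the transformers/peft stack, the base
# checkpoint, lora_alpha/dropout values, per-device batching, and the dataset.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

base_ckpt = "codellama/CodeLlama-7b-hf"  # placeholder base LLM

model = AutoModelForCausalLM.from_pretrained(base_ckpt)
tokenizer = AutoTokenizer.from_pretrained(base_ckpt)

# LoRA rank 128 for both the generator and the discriminator, per the quote;
# alpha and dropout here are assumed values.
lora_cfg = LoraConfig(r=128, lora_alpha=256, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)

# Generator fine-tuning: Adam(beta1=0.9, beta2=0.95), cosine learning-rate decay,
# 4 epochs, initial LR 9.65e-6, batch size 4 (treated here as per-device).
# The discriminator run would instead use 3 epochs and batch size 8.
args = TrainingArguments(
    output_dir="betterv-generator",
    num_train_epochs=4,
    per_device_train_batch_size=4,
    learning_rate=9.65e-6,
    adam_beta1=0.9,
    adam_beta2=0.95,
    lr_scheduler_type="cosine",
)

# trainer = Trainer(model=model, args=args, train_dataset=verilog_train_set)  # dataset not given here
# trainer.train()
```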
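The Research Type row describes BetterV's core mechanism: generative discriminators that steer the LLM's decoding toward a particular design demand. The sketch below illustrates one common way such guidance is realized, Bayes-rule reweighting of the next-token distribution by a class-conditional discriminator (GeDi-style); it is an illustration under that assumption, not the authors' released implementation, and the function name, per-step approximation, and weight parameter are hypothetical.

```python
import torch
import torch.nn.functional as F

def guided_next_token_logprobs(base_logits, pos_cond_logits, neg_cond_logits, weight=1.0):
    """Bayes-rule reweighting of a base LM's next-token distribution by a
    class-conditional (generative) discriminator. All inputs are 1-D tensors of
    unnormalized logits over the vocabulary."""
    base_lp = F.log_softmax(base_logits, dim=-1)      # log P(x_t | ctx)
    pos_lp = F.log_softmax(pos_cond_logits, dim=-1)   # log P(x_t | ctx, desired attribute)
    neg_lp = F.log_softmax(neg_cond_logits, dim=-1)   # log P(x_t | ctx, undesired attribute)
    # Per-step class posterior under equal priors (a full implementation would
    # also accumulate evidence over the generated prefix):
    #   log P(desired | x_t, ctx) = pos_lp - logaddexp(pos_lp, neg_lp)
    class_lp = pos_lp - torch.logaddexp(pos_lp, neg_lp)
    # log P(x_t | ctx, desired) ∝ log P(x_t | ctx) + weight * log P(desired | x_t, ctx)
    return F.log_softmax(base_lp + weight * class_lp, dim=-1)

# Toy usage with random logits standing in for real model outputs.
vocab_size = 32000
base = torch.randn(vocab_size)
pos = torch.randn(vocab_size)
neg = torch.randn(vocab_size)
next_token_id = torch.argmax(guided_next_token_logprobs(base, pos, neg, weight=2.0))
```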