Position: Building Guardrails for Large Language Models Requires Systematic Design
Authors: Yi Dong, Ronghui Mu, Gaojie Jin, Yi Qi, Jinwei Hu, Xingyu Zhao, Jie Meng, Wenjie Ruan, Xiaowei Huang
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This position paper takes a deep look at current open-source solutions (Llama Guard, Nvidia NeMo, Guardrails AI), and discusses the challenges and the road towards building more complete solutions. Drawing on robust evidence from previous research, we advocate for a systematic approach to construct guardrails for LLMs, based on comprehensive consideration of diverse contexts across various LLM applications. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, University of Liverpool, UK; (2) Key Laboratory of System Software (Chinese Academy of Sciences) and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; (3) WMG, University of Warwick, Warwick, UK; (4) Institute of Digital Technologies, Loughborough University London, UK. |
| Pseudocode | No | The paper includes workflow diagrams (Figure 1, Figure 2, Figure 3) but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper discusses existing open-source guardrail solutions developed by others (Llama Guard, Nvidia NeMo, Guardrails AI) but does not provide its own source code for the systematic design it advocates. |
| Open Datasets | No | The paper discusses concepts related to LLM training data and references other works that use datasets, but it does not specify or provide access information for any datasets used in its own research or for training methods proposed by the authors. |
| Dataset Splits | No | The paper discusses verification and validation as important aspects for building guardrails, but it does not provide specific dataset splits (training, validation, test) for any experiments conducted by the authors in this paper. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to conduct the research or analysis presented. |
| Software Dependencies | No | The paper mentions various software frameworks (e.g., Llama Guard, Nvidia NeMo) but does not list specific software dependencies with version numbers required to replicate any part of the authors' research. |
| Experiment Setup | No | The paper is a position paper and does not describe an experimental setup, hyperparameters, or training settings for any new models or methods developed by the authors. |