LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

Authors: Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric Xing, Joseph E. Gonzalez, Ion Stoica, Hao Zhang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate its versatility through four use cases: developing content moderation models that perform similarly to GPT-4, building a safety benchmark, training instruction-following models that perform similarly to Vicuna, and creating challenging benchmark questions. The results are presented in Table 3.
Researcher Affiliation | Academia | 1 UC Berkeley, 2 UC San Diego, 3 Carnegie Mellon University, 4 Stanford, 5 MBZUAI
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The dataset is publicly available at https://huggingface.co/datasets/lmsys/lmsys-chat-1m. The code for this website is publicly available at https://github.com/lm-sys/FastChat/tree/v0.2.26#serving-with-web-gui.
Open Datasets | Yes | The dataset is publicly available at https://huggingface.co/datasets/lmsys/lmsys-chat-1m. LMSYS-Chat-1M is collected on our website from April to August 2023.
Dataset Splits | No | The paper describes data selection for training and evaluation sets for specific tasks, but does not provide explicit train/validation/test splits or percentages for any of its models' training processes that would allow the data partitioning to be reproduced.
Hardware Specification | Yes | We utilize dozens of A100 GPUs to host our website, serving a total of 25 models over the course of the timespan.
Software Dependencies | Yes | The text-moderation-latest (006) is the latest OpenAI moderation API (OpenAI, 2023b) introduced on 2023/8/25.
Experiment Setup | Yes | Instead of developing a classifier, we fine-tune a language model to generate explanations for why a particular message was flagged, based on the system prompt described in the moderation task (see Appendix B.2). The detailed system prompt and few-shot examples can be found in Appendix B.7.
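To make the open-dataset claim concrete for anyone attempting reproduction: LMSYS-Chat-1M records follow an OpenAI-style chat schema with per-turn moderation results. The sketch below works over a hypothetical in-memory record; the field names ("conversation", "openai_moderation", "model") are assumptions based on the public dataset card, not taken from the paper itself.

```python
# Minimal sketch of filtering moderation-flagged conversations in an
# LMSYS-Chat-1M-style record. Field names are assumed from the dataset
# card; the record below is a hypothetical stand-in for a real row.
record = {
    "model": "vicuna-13b",
    "conversation": [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi, how can I help?"},
    ],
    # One moderation result per turn, mirroring the OpenAI moderation API shape.
    "openai_moderation": [{"flagged": False}, {"flagged": False}],
}

def is_flagged(rec):
    """Return True if any turn in the conversation was flagged by moderation."""
    return any(m["flagged"] for m in rec["openai_moderation"])

print(is_flagged(record))
```

A filter like this is the first step in the paper's content-moderation use case: partitioning conversations by whether any turn was flagged before training or evaluating a moderation model.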