Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline

Authors: Zangwei Zheng, Xiaozhe Ren, Fuzhao Xue, Yang Luo, Xin Jiang, Yang You

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on real-world instruction datasets using the LLaMA-based model, and our results demonstrate an impressive 86% improvement in inference throughput without compromising effectiveness.
Researcher Affiliation | Collaboration | 1 Department of Computer Science, National University of Singapore; 2 Noah's Ark Lab, Huawei. {zangwei, f-xue, yangluo, youy}@comp.nus.edu.sg; {renxiaozhe, jiang.xin}@huawei.com
Pseudocode | No | The paper includes diagrams illustrating the pipeline (Figure 1) but does not contain any structured pseudocode or algorithm blocks (a rough sketch of the scheduling idea is given after this table).
Open Source Code | Yes | https://github.com/zhengzangw/Sequence-Scheduling
Open Datasets | Yes | Our experiments are conducted on two datasets: a set of 10,000 prompts from a subset of the alpaca dataset [33] (which is different from the one used to train the length predictor) and a set of 429 prompts from the Instruction-in-Wild datasets [36].
Dataset Splits | No | The paper states that experiments are conducted on two datasets: 'a set of 10,000 prompts from a subset of the alpaca dataset [33]... and a set of 429 prompts from the Instruction-in-Wild datasets [36].' It does not specify train/validation/test splits for the overall experimental evaluation, implying the full datasets are used for evaluation.
Hardware Specification | Yes | The inference is performed on the Vicuna-7B [4] model using an 80GB A100 GPU. The training was conducted on a single 80GB A100 GPU.
Software Dependencies | No | The paper states 'All codes are implemented in PyTorch [26]' but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | For our baseline experiments, we set the batch size to 16. Regarding the variable batch size strategy, we use a batch size of 16 for instructions with a length (L) greater than or equal to 300... We maintain a fixed group size of 256. We sample generations with a temperature of 0.5 for diversity in responses. Specifically, we set the learning rate to 0.00005 and trained the model for three epochs. (A configuration sketch mapping these values onto common training arguments is given after this table.)
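As noted in the Pseudocode row, the paper describes its pipeline only with diagrams. The block below is a minimal, hypothetical sketch of the length-grouped scheduling idea, built solely from the figures quoted in the Experiment Setup row (group size 256, batch size 16 for predicted lengths of 300 or more). The function name schedule_batches, the predictor interface, and the short-response batch size of 32 are assumptions, not taken from the authors' repository.

```python
from typing import Callable, List


def schedule_batches(
    prompts: List[str],
    predict_length: Callable[[str], int],
    group_size: int = 256,          # fixed group size quoted in the Experiment Setup row
    long_threshold: int = 300,      # predicted-length cutoff quoted in the paper
    long_batch_size: int = 16,      # batch size for long responses, quoted in the paper
    short_batch_size: int = 32,     # assumed value; the quoted setup only gives the long case
) -> List[List[str]]:
    """Group prompts by predicted response length so each micro-batch holds
    sequences of similar length, reducing padding during batched generation."""
    batches: List[List[str]] = []
    for start in range(0, len(prompts), group_size):
        group = prompts[start:start + group_size]
        # Sort the group by predicted response length (ascending).
        group.sort(key=predict_length)
        i = 0
        while i < len(group):
            # Tentatively take a large batch of short responses; if its longest
            # predicted length crosses the threshold, use the smaller batch size.
            candidate = group[i:i + short_batch_size]
            use_long = predict_length(candidate[-1]) >= long_threshold
            bs = long_batch_size if use_long else short_batch_size
            batches.append(group[i:i + bs])
            i += bs
    return batches
```

In the paper the predictor is the instruction-tuned LLM itself; in this sketch any callable that returns an estimated token count can be plugged in, for example schedule_batches(prompts, predict_length=lambda p: 4 * len(p.split())).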
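For the fine-tuning settings quoted in the Experiment Setup row (learning rate 0.00005, three epochs, a single 80GB A100), a hedged reproduction sketch using Hugging Face TrainingArguments could look as follows; the output path, per-device batch size, and bf16 flag are assumptions, since the quoted excerpt does not state them.

```python
from transformers import TrainingArguments

# Sketch that maps the quoted fine-tuning hyperparameters onto Hugging Face
# TrainingArguments; values marked "assumed" are not from the paper.
training_args = TrainingArguments(
    output_dir="./length-predictor",   # assumed output path
    learning_rate=5e-5,                # "we set the learning rate to 0.00005"
    num_train_epochs=3,                # "trained the model for three epochs"
    per_device_train_batch_size=8,     # assumed; not given in the quoted setup
    bf16=True,                         # assumed; reasonable on a single 80GB A100
)
```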