OWL: A Large Language Model for IT Operations

Authors: Hongcheng Guo, Jian Yang, Jiaheng Liu, Liqun Yang, Linzheng Chai, Jiaqi Bai, Junran Peng, Xiaorong Hu, Chao Chen, Dongfeng Zhang, Xu Shi, Tieqiao Zheng, Liangfan Zheng, Bo Zhang, Ke Xu, Zhoujun Li

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Further, we evaluate the performance of OWL on the Owl-Bench established by us and on open IT-related benchmarks. OWL demonstrates superior performance on IT tasks, outperforming existing models by significant margins.
Researcher Affiliation | Collaboration | Hongcheng Guo1, Jian Yang1, Jiaheng Liu1, Liqun Yang1, Linzheng Chai1, Jiaqi Bai1, Junran Peng1, Xiaorong Hu2, Chao Chen2, Dongfeng Zhang2, Xu Shi2, Tieqiao Zheng2, Liangfan Zheng2, Bo Zhang2, Ke Xu1, Zhoujun Li1; 1 State Key Lab of Software Development Environment, Beihang University; 2 Cloudwise Research
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/HC-Guo/Owl.
Open Datasets | Yes | In this paper, we introduce OWL, a large language model trained on our constructed Owl-Instruct dataset, which covers a wide range of IT-related information.
Dataset Splits | Yes | Specifically, we use the first 4000 log messages in each dataset to train and then test on the remaining logs. (A sketch of this split appears below the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, or memory) used for running the experiments were found in the paper.
Software Dependencies | No | No specific ancillary software details with version numbers (e.g., libraries or solvers) needed to replicate the experiment were found, other than the base model LLaMA2-13b.
Experiment Setup | Yes | For instruction tuning, the learning rate is 10^-4, the weight decay is 0.1, and the batch size is 16. The sequence length is 1024. We use Adam as the optimization algorithm with β1 = 0.9, β2 = 0.99, and ε = 10^-8. The number of training epochs is 3. The rank and alpha of LoRA (Hu et al., 2022) are 8 and 32. The dropout of LoRA is 0.05. We train LoRA for 10 epochs. (A configuration sketch follows this table.)
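To make the Experiment Setup row concrete, the following is a minimal sketch of the reported hyperparameters expressed as a fine-tuning configuration. It assumes Hugging Face Transformers and PEFT as the training stack and "meta-llama/Llama-2-13b-hf" as the base checkpoint; neither the framework nor the exact checkpoint identifier is stated in the paper, so treat both as assumptions.

```python
# Sketch of the instruction-tuning setup quoted above (assumed stack: Transformers + PEFT).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Assumed base checkpoint; the paper only names "LLaMA2-13b".
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")

# LoRA settings reported in the paper: rank 8, alpha 32, dropout 0.05.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)

# Optimizer and schedule settings reported in the paper. Note the paper states
# 3 epochs for instruction tuning but 10 epochs for the LoRA stage; the LoRA
# value is used here.
training_args = TrainingArguments(
    output_dir="owl-instruct-lora",   # hypothetical output path
    learning_rate=1e-4,
    weight_decay=0.1,
    per_device_train_batch_size=16,
    num_train_epochs=10,
    adam_beta1=0.9,
    adam_beta2=0.99,
    adam_epsilon=1e-8,
)
# The reported sequence length of 1024 would be enforced at tokenization time (not shown).
```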
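For the Dataset Splits row, a minimal sketch of the described split, assuming each log dataset is loaded as a plain list of messages (the function and variable names are illustrative):

```python
def split_logs(log_messages, n_train=4000):
    """Use the first 4000 log messages for training and test on the remaining logs."""
    train_logs = log_messages[:n_train]
    test_logs = log_messages[n_train:]
    return train_logs, test_logs
```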