OWL: A Large Language Model for IT Operations
Authors: Hongcheng Guo, Jian Yang, Jiaheng Liu, Liqun Yang, Linzheng Chai, Jiaqi Bai, Junran Peng, Xiaorong Hu, Chao Chen, Dongfeng Zhang, Xu Shi, Tieqiao Zheng, Liangfan Zheng, Bo Zhang, Ke Xu, Zhoujun Li
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Further, we evaluate the performance of OWL on the Owl-Bench established by us and open IT-related benchmarks. OWL demonstrates superior performance on IT tasks, outperforming existing models by significant margins. |
| Researcher Affiliation | Collaboration | Hongcheng Guo1, Jian Yang1, Jiaheng Liu1, Liqun Yang1, Linzheng Chai1, Jiaqi Bai1, Junran Peng1, Xiaorong Hu2, Chao Chen2, Dongfeng Zhang2, Xu Shi2, Tieqiao Zheng2, Liangfan Zheng2, Bo Zhang2, Ke Xu1, Zhoujun Li1 1State Key Lab of Software Development Environment, Beihang University 2Cloudwise Research |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/HC-Guo/Owl. |
| Open Datasets | Yes | In this paper, we introduce the OWL, a large language model trained on our constructed Owl-Instruct with a wide range of IT-related information. |
| Dataset Splits | Yes | Specifically, we use the first 4000 log messages in each dataset to train and then test on the remaining logs. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, or memory) used for running the experiments were found in the paper. |
| Software Dependencies | No | No specific ancillary software details with version numbers (e.g., libraries or solvers) needed to replicate the experiment were found, other than the base model LLaMA2-13b. |
| Experiment Setup | Yes | For instruction-tuning, the learning rate is 10^-4, the weight decay is 0.1, and the batch size is 16. The sequence length is 1024. We use Adam as the optimization algorithm with β1 = 0.9, β2 = 0.99, and ε = 10^-8. The training epoch is 3. The rank and alpha of LoRA (Hu et al., 2022) are 8 and 32. The dropout of LoRA is 0.05. We train LoRA for 10 epochs. |
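
The "Dataset Splits" row above describes a chronological split for the log-analysis experiments: the first 4000 log messages of each dataset are used for training and the remainder for testing. Below is a minimal sketch of that split; the function name and the in-memory list representation are assumptions for illustration, not part of the paper's released code.

```python
# Hypothetical sketch of the chronological split described in the
# "Dataset Splits" row. `log_messages` is assumed to be a list of raw
# log lines, ordered as they appear in the dataset.

def split_logs(log_messages, train_size=4000):
    """Return (train, test): the first `train_size` messages are used
    for training, the remaining messages for testing."""
    return log_messages[:train_size], log_messages[train_size:]
```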
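
The "Experiment Setup" row reports the LoRA and optimizer hyperparameters but not the training framework. The sketch below reproduces those values with the Hugging Face `peft` and `transformers` libraries; the library choice, the model hub ID, and the output path are assumptions, while the numeric values come from the quoted setup.

```python
# A minimal sketch of the reported instruction-tuning setup, assuming
# Hugging Face `peft` + `transformers` (the paper does not name a framework).
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# LoRA adapter settings reported in the paper: rank 8, alpha 32, dropout 0.05.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Base model: LLaMA2-13b; the hub ID below is an assumption.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")
model = get_peft_model(model, lora_config)

# Optimizer and schedule reported in the paper: Adam with beta1=0.9,
# beta2=0.99, eps=1e-8, learning rate 1e-4, weight decay 0.1,
# batch size 16, and 10 LoRA training epochs.
training_args = TrainingArguments(
    output_dir="owl-lora",          # output path is an assumption
    learning_rate=1e-4,
    weight_decay=0.1,
    per_device_train_batch_size=16,
    num_train_epochs=10,
    adam_beta1=0.9,
    adam_beta2=0.99,
    adam_epsilon=1e-8,
)
```

The reported sequence length of 1024 would be enforced at tokenization time (e.g., truncating inputs to 1024 tokens), which is why it does not appear among the training arguments above.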