Agent Planning with World Knowledge Model

Authors: Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on three complex real-world simulated datasets with three state-of-the-art open-source LLMs, Mistral-7B, Gemma-7B, and Llama-3-8B, demonstrate that our method can achieve superior performance compared to various strong baselines.
Researcher Affiliation | Collaboration | Zhejiang University; National University of Singapore, NUS-NCS Joint Lab; Alibaba Group; Zhejiang Key Laboratory of Big Data Intelligent Computing
Pseudocode | No | No explicit pseudocode or algorithm blocks were found.
Open Source Code | Yes | The code is available at https://github.com/zjunlp/WKM.
Open Datasets | Yes | We evaluate our method on three real-world simulated planning datasets: ALFWorld [41], WebShop [53], and ScienceWorld [50].
Dataset Splits | Yes | Table 5 (dataset statistics): ALFWorld: 3,119 train, 140 text-seen, 134 text-unseen; WebShop: 1,824 train, 200 test; ScienceWorld: 1,483 train, 194 text-seen, 211 text-unseen.
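For readers who want the splits in machine-readable form, the sketch below captures them as a small Python mapping. This is purely illustrative: the dict layout and key names are mine, only the counts come from Table 5 as quoted above (WebShop reports a single test set).

```python
# Dataset statistics as reported in Table 5 of the paper.
# The structure and key names are illustrative, not the authors' own.
DATASET_SPLITS = {
    "ALFWorld":     {"train": 3119, "test_seen": 140, "test_unseen": 134},
    "WebShop":      {"train": 1824, "test": 200},  # single test set reported
    "ScienceWorld": {"train": 1483, "test_seen": 194, "test_unseen": 211},
}

if __name__ == "__main__":
    for name, splits in DATASET_SPLITS.items():
        print(f"{name}: {splits} (total {sum(splits.values())})")
```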
Hardware Specification | Yes | All the training and inference experiments are conducted on 8 NVIDIA V100 32G GPUs within 12 hours.
Software Dependencies | No | We fine-tune the proposed approach with LoRA [12] using the LLaMA-Factory [62] framework. (Frameworks are named, but no version information is provided.)
Experiment Setup | Yes | Table 6 (detailed hyperparameters): LoRA r 8; LoRA alpha 16; LoRA dropout 0.05; LoRA target modules q_proj, v_proj; cutoff length 2048; epochs 3; batch size 32 (4 per device, 2 gradient accumulation steps); learning rate 1e-4; warmup ratio 0.03; temperature 0.0, 0.5; retrieved state knowledge N 3000; P_agent(a) weight γ 0.4, 0.5, 0.7.
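The LoRA hyperparameters in Table 6 map directly onto a standard Hugging Face PEFT configuration. The sketch below is a minimal, hypothetical reconstruction using PEFT and transformers, not the authors' LLaMA-Factory pipeline; the model name, output path, and overall wiring are assumptions, while the numeric values are taken from Table 6.

```python
# Minimal sketch of a LoRA fine-tuning setup matching Table 6.
# Assumption-laden reconstruction with Hugging Face PEFT + transformers,
# NOT the authors' LLaMA-Factory training script.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

lora_config = LoraConfig(
    r=8,                                  # lora r
    lora_alpha=16,                        # lora alpha
    lora_dropout=0.05,                    # lora dropout
    target_modules=["q_proj", "v_proj"],  # lora target modules
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="wkm-lora",                # hypothetical path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    learning_rate=1e-4,
    warmup_ratio=0.03,
)

# One of the three backbones evaluated in the paper (gated on the Hub).
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Note that the remaining Table 6 entries are not training arguments: the cutoff length of 2048 would be enforced during tokenization, while the temperature and the P_agent(a) weight γ govern inference-time decoding.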