reproducibilityindex.ai

Agent Planning with World Knowledge Model

Authors: Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on three complex real-world simulated datasets with three state-of-the-art open-source LLMs, Mistral-7B, Gemma-7B, and Llama-3-8B, demonstrate that our method can achieve superior performance compared to various strong baselines.
Researcher Affiliation	Collaboration	Zhejiang University; National University of Singapore, NUS-NCS Joint Lab; Alibaba Group; Zhejiang Key Laboratory of Big Data Intelligent Computing
Pseudocode	No	No explicit pseudocode or algorithm blocks were found.
Open Source Code	Yes	The code is available at https://github.com/zjunlp/WKM.
Open Datasets	Yes	We evaluate our method on three real-world simulated planning datasets: ALFWorld [41], WebShop [53], and ScienceWorld [50].
Dataset Splits	Yes	Table 5: Dataset statistics (Train / Test-Seen / Test-Unseen). ALFWorld: 3,119 / 140 / 134; WebShop: 1,824 / 200 (only one test figure is listed); ScienceWorld: 1,483 / 194 / 211.
Hardware Specification	Yes	All the training and inference experiments are conducted on 8 NVIDIA V100 32G GPUs within 12 hours.
Software Dependencies	No	We fine-tune the proposed approach with LoRA [12] using the LlamaFactory [62] framework.
Experiment Setup	Yes	Table 6: Detailed hyperparameters used in our paper. lora r: 8; lora alpha: 16; lora dropout: 0.05; lora target modules: q_proj, v_proj; cutoff len: 2048; epochs: 3; batch size: 32; batch size per device: 4; gradient accumulation steps: 2; learning rate: 1e-4; warmup ratio: 0.03; temperature: 0.0, 0.5; retrieved state knowledge N: 3000; P_agent(A_u) weight γ: 0.4, 0.5, 0.7.
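
For readers who want to reuse the reported setup, the LoRA and training values from Table 6 can be collected into a plain configuration mapping. The following is a minimal sketch, not the authors' actual configuration file: the key names (`lora_rank`, `lora_target`, `cutoff_len`, and so on) are assumptions modeled on common LlamaFactory and Hugging Face argument names and may differ between versions, and `format_config` is a hypothetical helper. Only the values come from the table.

```python
# Values are taken from Table 6; key names are ASSUMED, not confirmed by the source.
wkm_lora_config = {
    "finetuning_type": "lora",          # assumption: LoRA fine-tuning, as stated in the entry
    "lora_rank": 8,                     # "lora r"
    "lora_alpha": 16,                   # "lora alpha"
    "lora_dropout": 0.05,               # "lora dropout"
    "lora_target": "q_proj,v_proj",     # "lora target modules"
    "cutoff_len": 2048,                 # "cutoff len"
    "num_train_epochs": 3,              # "epochs"
    "per_device_train_batch_size": 4,   # "batch size per device"
    "gradient_accumulation_steps": 2,   # "gradient accumulation steps"
    "learning_rate": 1e-4,              # "learning rate"
    "warmup_ratio": 0.03,               # "warmup ratio"
}


def format_config(config):
    """Render the mapping as simple 'key: value' lines (hypothetical helper)."""
    return "\n".join(f"{key}: {value}" for key, value in config.items())


if __name__ == "__main__":
    print(format_config(wkm_lora_config))
```

The inference-time settings (temperature, retrieved state knowledge N, and the weight γ) are left out because the entry lists several values for them without saying which applies to which dataset.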