Tree-Planner: Efficient Close-loop Task Planning with Large Language Models
Authors: Mengkang Hu, Yao Mu, Xinmiao Chelsey Yu, Mingyu Ding, Shiguang Wu, Wenqi Shao, Qiguang Chen, Bin Wang, Yu Qiao, Ping Luo
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that TREE-PLANNER achieves state-of-the-art performance while maintaining high efficiency. |
| Researcher Affiliation | Collaboration | Corresponding authors: Mingyu Ding and Ping Luo ({dingmyu, pluo.lhi}@gmail.com). The University of Hong Kong. Harbin Institute of Technology. Noah's Ark Laboratory. Shanghai AI Laboratory. |
| Pseudocode | Yes | Algorithm 1: Action Tree Construction. Input: c, r. Output: r. (See the action-tree sketch below the table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for their methodology is open-source or publicly available. |
| Open Datasets | Yes | Environment. We conduct the experiments in the VirtualHome (VH) Environment (Puig et al., 2018), a simulation platform for household tasks. Dataset. We constructed a dataset consisting of 4 VH scenes and 35 unique VH tasks. Each task includes a task name, goal conditions, and a gold plan. We started by annotating goal conditions for each task from the ActivityPrograms knowledge base by Puig et al. (2018) via executing the programs. |
| Dataset Splits | Yes | We take 4 representative tasks from the dataset as in-context learning exemplars and the rest as the validation set. |
| Hardware Specification | No | The paper mentions using the 'OpenAI GPT-3.5 (text-davinci-003) API' but does not specify any particular hardware used to run its experiments. |
| Software Dependencies | No | The paper mentions using the 'OpenAI GPT-3.5 (text-davinci-003) API' and 'BERT similarity' (linking to sbert.net) but does not provide specific version numbers for these or other software dependencies. (See the similarity sketch below the table.) |
| Experiment Setup | Yes | To sample diverse plans, we applied a temperature of 0.8 and a top-p value of 0.95. During grounded deciding, we set the temperature to 0.7, top-p to 1.0, and the sampling parameter n to 20. Additionally, we utilize a majority vote to obtain the final option in order to alleviate format errors in the output of LLMs. The maximum number of error corrections is set to 10 for all evaluated approaches. ... In the case of Grounded Deciding, the optimal hyperparameter combination was found to be a temperature of 0.7 and top-p of 1.0. As for ITERATIVE-PLANNER, the optimal hyperparameter combination was a temperature of 0 and top-p of 1.0. (See the sampling-and-voting sketch below the table.) |
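
The Pseudocode row quotes only the header of Algorithm 1, so a minimal sketch may help. The Python below is not the authors' code: `Node`, `build_action_tree`, and the example plans are illustrative assumptions. It merges sampled plans into an action tree by collapsing shared action prefixes, which is what the paper's action tree construction does before grounded deciding traverses the tree.

```python
# Hedged sketch of action tree construction (not the authors' implementation):
# sampled plans (lists of action strings) are merged so that shared prefixes
# collapse into a single branch of the tree.
from dataclasses import dataclass, field

@dataclass
class Node:
    action: str                                    # action label; "ROOT" for the root
    children: dict = field(default_factory=dict)   # maps action string -> child Node

def build_action_tree(plans):
    """Merge sampled plans into one action tree by collapsing common prefixes."""
    root = Node("ROOT")
    for plan in plans:
        node = root
        for action in plan:
            # Reuse the existing child if this action was already sampled here;
            # otherwise open a new branch.
            node = node.children.setdefault(action, Node(action))
    return root

# Illustrative example: three sampled plans that share their first action.
plans = [
    ["walk to living room", "find TV", "switch on TV"],
    ["walk to living room", "find TV", "sit on sofa"],
    ["walk to living room", "find remote", "switch on TV"],
]
tree = build_action_tree(plans)
assert list(tree.children) == ["walk to living room"]  # shared prefix was merged
```

Grounded deciding then only has to choose among a node's children at each step, which is why sampling plans once and aggregating them into a tree is cheaper than re-prompting for a full plan after every correction.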
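The Software Dependencies row notes that 'BERT similarity' (sbert.net) is used without a version. A minimal sketch, assuming the `sentence-transformers` package and the `all-MiniLM-L6-v2` checkpoint (an assumption; the paper names neither), of matching a free-form generated action to the most similar executable action:

```python
# Hedged sketch of BERT-similarity matching via sentence-transformers (sbert.net).
# The checkpoint is an assumption; the paper specifies no model or version.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def closest_executable(generated_action, executable_actions):
    """Return the executable action whose embedding is closest to the LLM output."""
    query = model.encode(generated_action, convert_to_tensor=True)
    candidates = model.encode(executable_actions, convert_to_tensor=True)
    scores = util.cos_sim(query, candidates)[0]  # one cosine similarity per candidate
    return executable_actions[int(scores.argmax())]

print(closest_executable("turn the television on",
                         ["switch on TV", "open fridge", "sit on sofa"]))
# expected: "switch on TV"
```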
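For the decoding settings in the Experiment Setup row, here is a minimal sketch of grounded deciding with majority voting, assuming the legacy `openai` (<1.0) Python client that matches the text-davinci-003 API the paper mentions; the prompt and `max_tokens` are placeholders, not the authors' values. Plan sampling would use temperature 0.8 and top-p 0.95 instead of the values below.

```python
# Hedged sketch of grounded deciding with a majority vote (not the authors' code).
# Uses the legacy openai<1.0 Completion API implied by text-davinci-003.
from collections import Counter
import openai

def decide_with_majority_vote(prompt):
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,          # placeholder: the grounded-deciding prompt
        temperature=0.7,        # grounded-deciding temperature from the paper
        top_p=1.0,              # grounded-deciding top-p from the paper
        n=20,                   # sample 20 decisions ...
        max_tokens=8,           # placeholder budget for a short option label
    )
    options = [choice.text.strip() for choice in response.choices]
    # ... then take a majority vote to alleviate format errors in single samples.
    return Counter(options).most_common(1)[0][0]
```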