On Grounded Planning for Embodied Tasks with Language Models

Authors: Bill Yuchen Lin, Chengsong Huang, Qian Liu, Wenda Gu, Sam Sommerer, Xiang Ren

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate that the use of tables for encoding the environment and an iterative decoding strategy can significantly enhance the LMs' ability in grounded planning. Our analysis also reveals interesting and non-trivial findings. (An illustrative sketch of this idea follows the table.)
Researcher Affiliation | Collaboration | Bill Yuchen Lin1*, Chengsong Huang2*, Qian Liu3, Wenda Gu1, Sam Sommerer1, Xiang Ren1 (1: University of Southern California; 2: Fudan University; 3: Sea AI Lab)
Pseudocode | No | The paper describes its methods verbally and with diagrams (Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | Our project website at https://inklab.usc.edu/G-PlanET.
Open Datasets | Yes | To build a large-scale dataset for studying the G-PlanET task, we re-use the goals and the plans of ALFRED and extract object information from AI2THOR for the aligned environment. The ALFRED dataset uses the AI2THOR engine to provide an interactive environment for agents with an egocentric vision to perform actions. (A hedged extraction sketch also follows the table.)
Dataset Splits | Yes | Data split (# tasks): train 21,025; valid 820 / 821; test 705 / 694, where the valid and test sets are each divided between seen and unseen room layouts.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions various models and tools used (BART, T5, GPT-J, TAPEX, AI2THOR) but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | No | Due to the page limit, we leave the details of the data, methods, and hyper-parameters in the Appendix that are linked to our project website.
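
The Open Datasets row describes extracting object information from AI2THOR for the environment aligned with each ALFRED plan. As a hedged illustration only, not the authors' released pipeline, the sketch below reads object metadata from a live AI2THOR scene and flattens it into table rows; the helper name `objects_to_rows` and the chosen columns are assumptions.

```python
# Illustrative sketch: flatten AI2THOR object metadata into table rows.
# Not the authors' code; the column choices are assumptions. Requires
# `pip install ai2thor` (the first run downloads and launches the simulator).
from ai2thor.controller import Controller

def objects_to_rows(metadata):
    """Turn AI2THOR object metadata into (name, type, x, y, z, parent) tuples."""
    rows = []
    for obj in metadata["objects"]:
        pos = obj["position"]
        parents = obj.get("parentReceptacles") or []
        rows.append((
            obj["name"],
            obj["objectType"],
            round(pos["x"], 2), round(pos["y"], 2), round(pos["z"], 2),
            parents[0] if parents else "None",
        ))
    return rows

if __name__ == "__main__":
    controller = Controller(scene="FloorPlan1")  # one of the kitchen layouts
    rows = objects_to_rows(controller.last_event.metadata)
    for row in rows[:5]:
        print(row)
    controller.stop()
```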
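
The Research Type row quotes the paper's central finding: encoding the environment as a table and decoding the plan iteratively improves grounded planning. The sketch below is a minimal rendering of that idea, not the authors' implementation; `lm_generate` stands in for any sequence-to-sequence LM call (e.g., a fine-tuned BART or T5), and the linearization tokens ([ROW], [ENV], [HISTORY]) are assumptions.

```python
# Minimal sketch of table-encoded prompting with iterative plan decoding.
# Not the authors' implementation: `lm_generate` is a stand-in for any
# seq2seq LM, and the linearization format is an assumption.
from typing import Callable, List, Tuple

def linearize_table(rows: List[Tuple], header: Tuple) -> str:
    """Linearize an object table into a flat string an LM can consume."""
    cells = [" | ".join(str(c) for c in header)]
    cells += [" | ".join(str(c) for c in row) for row in rows]
    return " [ROW] ".join(cells)

def iterative_plan(goal: str, table: str,
                   lm_generate: Callable[[str], str],
                   max_steps: int = 10) -> List[str]:
    """Decode one action per call, feeding previous steps back as context."""
    steps: List[str] = []
    for i in range(1, max_steps + 1):
        prompt = (f"Goal: {goal} [ENV] {table} "
                  f"[HISTORY] {' ; '.join(steps) or 'none'} [STEP {i}]")
        step = lm_generate(prompt).strip()
        if step.lower() == "done":
            break
        steps.append(step)
    return steps

if __name__ == "__main__":
    header = ("name", "type", "x", "y", "z", "parent")
    rows = [("Mug_1", "Mug", 1.2, 0.9, -0.5, "CounterTop_1")]
    table = linearize_table(rows, header)
    # Toy stand-in LM that emits two canned steps, then stops.
    canned = iter(["walk to the counter", "pick up the mug", "done"])
    plan = iterative_plan("put a mug in the sink", table, lambda _: next(canned))
    print(plan)  # ['walk to the counter', 'pick up the mug']
```

Decoding one step at a time lets each prediction condition on the environment table plus the steps emitted so far, which is the intuition behind the iterative strategy the row above credits.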