Exploratory Retrieval-Augmented Planning For Continual Embodied Instruction Following
Authors: Minjong Yoo, Jinwoo Jang, Wei-Jin Park, Honguk Woo
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments with VirtualHome, ALFRED, and CARLA, our approach demonstrates robustness against a variety of embodied instruction following scenarios involving different instruction scales and types, and non-stationarity degrees, and it consistently outperforms other state-of-the-art LLM-based task planning approaches in terms of both goal success rate and execution efficiency. |
| Researcher Affiliation | Collaboration | Minjong Yoo¹, Jinwoo Jang¹, Wei-jin Park², Honguk Woo¹; ¹Department of Computer Science and Engineering, Sungkyunkwan University; ²Acryl Inc. |
| Pseudocode | Yes | Algorithm 1: Detailed implementation of ExRAP framework |
| Open Source Code | Yes | We provide implementation details in Appendix, and also release source code. |
| Open Datasets | Yes | Through experiments with VirtualHome [8], ALFRED [9] and CARLA [10], we demonstrate that the ExRAP framework achieves competitive performance in both task success and efficiency compared to several state-of-the-art embodied planning methods, including ZSP [11], SayCan [1], ProgPrompt [3], and LLM-Planner [12]. |
| Dataset Splits | No | The paper uses 100 trajectories across 10 different environment settings in VirtualHome, and 50 trajectories in ALFRED and CARLA for in-context learning and evaluation, but does not specify explicit train/validation/test splits by percentage or sample count. |
| Hardware Specification | Yes | Our framework is implemented using Python v3.10 and trained on a system of an Intel(R) Core(TM) i9-10980XE processor and two NVIDIA RTX A6000 GPUs. |
| Software Dependencies | Yes | Our framework is implemented using Python v3.10 and trained on a system of an Intel(R) Core(TM) i9-10980XE processor and two NVIDIA RTX A6000 GPUs. |
| Experiment Setup | Yes | The hyperparameter settings for baselines are summarized in Table A.3. The hyperparameter settings for ExRAP are summarized in Table A.4 (e.g., LLM: Llama-3-8B (default); temperature: 0.33; filtering threshold θ in Eq. (8): 0.5; weight for exploration value w_R in Eq. (12): 1.0; weight for exploitation value w_T in Eq. (12): 0.01). |
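
For anyone attempting a reproduction, the hyperparameter values quoted in the Experiment Setup row could be collected into a single configuration object along the following lines. This is only a minimal sketch: the class name `ExRAPConfig` and all field names are hypothetical and are not taken from the released source code; only the values come from the row above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExRAPConfig:
    """Hypothetical container for the values quoted from Table A.4.

    The field names are illustrative assumptions; only the default
    values are taken from the Experiment Setup row.
    """

    llm: str = "Llama-3-8B"            # default LLM reported in the row
    temperature: float = 0.33          # LLM sampling temperature
    filtering_threshold: float = 0.5   # threshold theta in Eq. (8)
    exploration_weight: float = 1.0    # w_R in Eq. (12)
    exploitation_weight: float = 0.01  # w_T in Eq. (12)


# Default configuration mirroring the reported settings.
DEFAULT_CONFIG = ExRAPConfig()

# Plain-dict form, convenient for logging alongside experiment results.
CONFIG_DICT = asdict(DEFAULT_CONFIG)
```

Keeping the reported values in one frozen object makes it harder to change a setting by accident between runs, and the dictionary form can be written out next to the results for later comparison.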