Exploratory Retrieval-Augmented Planning For Continual Embodied Instruction Following

Authors: Minjong Yoo, Jinwoo Jang, Wei-Jin Park, Honguk Woo

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through experiments with VirtualHome, ALFRED, and CARLA, our approach demonstrates robustness against a variety of embodied instruction following scenarios involving different instruction scales and types, and non-stationarity degrees, and it consistently outperforms other state-of-the-art LLM-based task planning approaches in terms of both goal success rate and execution efficiency.
Researcher Affiliation | Collaboration | Minjong Yoo1, Jinwoo Jang1, Wei-jin Park2, Honguk Woo1; 1Department of Computer Science and Engineering, Sungkyunkwan University; 2Acryl Inc.
Pseudocode | Yes | Algorithm 1 Detailed implementation of ExRAP framework
Open Source Code | Yes | We provide implementation details in Appendix, and also release source code.
Open Datasets | Yes | Through experiments with VirtualHome [8], ALFRED [9] and CARLA [10], we demonstrate that the ExRAP framework achieves competitive performance in both task success and efficiency compared to several state-of-the-art embodied planning methods, including ZSP [11], SayCan [1], ProgPrompt [3], and LLM-Planner [12].
Dataset Splits | No | The paper uses 100 trajectories across 10 different environment settings in VirtualHome, and 50 trajectories in ALFRED and CARLA for in-context learning and evaluation, but does not specify explicit train/validation/test splits by percentage or sample count.
Hardware Specification | Yes | Our framework is implemented using Python v3.10 and trained on a system of an Intel(R) Core(TM) i9-10980XE processor and two NVIDIA RTX A6000 GPUs.
Software Dependencies | Yes | Our framework is implemented using Python v3.10 and trained on a system of an Intel(R) Core(TM) i9-10980XE processor and two NVIDIA RTX A6000 GPUs.
Experiment Setup | Yes | The hyperparameter settings for baselines are summarized in Table A.3. The hyperparameter settings for ExRAP are summarized in Table A.4. (e.g., LLM: Llama-3-8B (default); temperature: 0.33; filtering threshold θ in (8): 0.5; weight for exploration value w_R in (12): 1.0; weight for exploitation value w_T in (12): 0.01)
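
For readers who want to carry the quoted Table A.4 settings into their own scripts, the values could be collected in a small configuration object. The following is only a minimal sketch: the class name and field names are hypothetical and are not taken from the released source code; only the numeric values and the model name come from the Experiment Setup entry above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExRAPHyperparams:
    """Hypothetical container for the settings quoted from Table A.4.

    Field names are illustrative assumptions; the paper's code may
    organize these values differently.
    """

    llm: str = "Llama-3-8B"            # default LLM reported in the paper
    temperature: float = 0.33          # LLM sampling temperature
    filtering_threshold: float = 0.5   # theta in (8)
    w_exploration: float = 1.0         # w_R in (12), as labeled in the summary
    w_exploitation: float = 0.01       # w_T in (12), as labeled in the summary


config = ExRAPHyperparams()

# Print the settings as a plain dictionary, e.g. for logging a run.
for name, value in asdict(config).items():
    print(f"{name}: {value}")
```

Freezing the dataclass keeps the reported defaults from being modified by accident; a variation for an ablation could be created with `dataclasses.replace(config, temperature=0.0)`.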