reproducibilityindex.ai

Exploratory Retrieval-Augmented Planning For Continual Embodied Instruction Following

Authors: Minjong Yoo, Jinwoo Jang, Wei-Jin Park, Honguk Woo

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through experiments with VirtualHome, ALFRED, and CARLA, our approach demonstrates robustness against a variety of embodied instruction following scenarios involving different instruction scales and types, and non-stationarity degrees, and it consistently outperforms other state-of-the-art LLM-based task planning approaches in terms of both goal success rate and execution efficiency.
Researcher Affiliation	Collaboration	Minjong Yoo¹, Jinwoo Jang¹, Wei-jin Park², Honguk Woo¹; ¹Department of Computer Science and Engineering, Sungkyunkwan University; ²Acryl Inc.
Pseudocode	Yes	Algorithm 1: Detailed implementation of ExRAP framework
Open Source Code	Yes	We provide implementation details in Appendix, and also release source code.
Open Datasets	Yes	Through experiments with VirtualHome [8], ALFRED [9] and CARLA [10], we demonstrate that the ExRAP framework achieves competitive performance in both task success and efficiency compared to several state-of-the-art embodied planning methods, including ZSP [11], SayCan [1], ProgPrompt [3], and LLM-Planner [12].
Dataset Splits	No	The paper uses 100 trajectories across 10 different environment settings in VirtualHome, and 50 trajectories in ALFRED and CARLA for in-context learning and evaluation, but does not specify explicit train/validation/test splits by percentage or sample count.
Hardware Specification	Yes	Our framework is implemented using Python v3.10 and trained on a system of an Intel(R) Core(TM) i9-10980XE processor and two NVIDIA RTX A6000 GPUs.
Software Dependencies	Yes	Our framework is implemented using Python v3.10 and trained on a system of an Intel(R) Core(TM) i9-10980XE processor and two NVIDIA RTX A6000 GPUs.
Experiment Setup	Yes	The hyperparameter settings for baselines are summarized in Table A.3. The hyperparameter settings for ExRAP are summarized in Table A.4 (e.g., LLM: Llama-3-8B (default); temperature: 0.33; filtering threshold θ in Eq. (8): 0.5; weight for exploration value w_R in Eq. (12): 1.0; weight for exploitation value w_T in Eq. (12): 0.01).
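
For anyone re-running the setup, the example values quoted in the Experiment Setup row can be pinned in a small configuration object. The sketch below is illustrative only: the class name `ExRAPConfig` and all field names are hypothetical and not taken from the released source code. Only the numeric values and the model name come from the row above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExRAPConfig:
    """Hypothetical container for the example hyperparameters quoted above.

    Field names are assumptions made for illustration; the values are the
    ones listed in the Experiment Setup row (attributed to Table A.4).
    """

    llm: str = "Llama-3-8B"            # default LLM
    temperature: float = 0.33          # LLM sampling temperature
    filtering_threshold: float = 0.5   # theta in equation (8)
    exploration_weight: float = 1.0    # w_R in equation (12)
    exploitation_weight: float = 0.01  # w_T in equation (12)


config = ExRAPConfig()

# Print the settings so a run log records exactly what was used.
for name, value in asdict(config).items():
    print(f"{name} = {value}")
```

Freezing the dataclass keeps the recorded settings from being changed by accident during a run; values from Table A.3 (baselines) are not quoted in this summary and are therefore not included.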