Egocentric Planning for Scalable Embodied Task Achievement

Authors: Xiaotian Liu, Hector Palacios, Christian Muise

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated our approach in ALFRED, a simulated environment designed for domestic tasks, and demonstrated its high scalability, achieving an impressive 36.07% unseen success rate in the ALFRED benchmark and winning the ALFRED challenge at CVPR Embodied AI workshop.
Researcher Affiliation | Collaboration | Xiaotian Liu, ServiceNow Research, Montreal, QC, Canada, xiaotian.liu@mail.utoronto.ca; Hector Palacios, ServiceNow Research, Montreal, QC, Canada, hectorpal@gmail.com; Christian Muise, Queen's University, Kingston, ON, Canada, christian.muise@queensu.ca
Pseudocode | Yes | Algorithm 1 Iterative Exploration Replanning (IER)
Open Source Code | No | The paper mentions using a pre-trained model provided by FILM and converting its template-based result, but does not provide an explicit statement or link to their own open-source code for the methodology described.
Open Datasets | Yes | We evaluated our approach in ALFRED, a simulated environment designed for domestic tasks, and demonstrated its high scalability, achieving an impressive 36.07% unseen success rate in the ALFRED benchmark and winning the ALFRED challenge at CVPR Embodied AI workshop.
Dataset Splits | Yes | The ALFRED dataset contains a validation dataset which is split into 820 Validation Seen episodes and 821 Validation Unseen episodes.
Hardware Specification | Yes | Perception and Language module was fine-tuned on a Nvidia 3080.
Software Dependencies | No | The paper mentions models like U-Net and Mask-RCNN and using pre-trained FILM models, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | The random exploration step is used to generate diverse set of object cluster for further exploration using our planner. Subsequently, at t = 500, the gathered information from the semantic spatial graph is converted into a PDDL problem for the agent. ... we allow our agent to first conduct 500 random exploration movements.
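
The Experiment Setup entry describes two phases: 500 random exploration movements, then conversion of the gathered semantic spatial graph into a PDDL problem. The Python sketch below illustrates that flow under loud assumptions and is not the authors' implementation. The action names, the `step_fn` callback, the dictionary used as a toy graph, the `at` predicate, and the `household` domain name are all hypothetical; only the 500-step budget comes from the text above.

```python
import random

EXPLORATION_STEPS = 500  # step budget stated in the Experiment Setup entry
ACTIONS = ["MoveAhead", "RotateLeft", "RotateRight"]  # illustrative action names (assumption)


def random_exploration(step_fn, steps=EXPLORATION_STEPS, seed=0):
    """Take `steps` random movements and record what is observed.

    `step_fn(action)` is a hypothetical callback that executes one action
    and returns an iterable of (object_name, location_name) observations.
    """
    rng = random.Random(seed)
    graph = {}  # location -> set of objects; a toy stand-in for the semantic spatial graph
    for _ in range(steps):
        action = rng.choice(ACTIONS)
        for obj, loc in step_fn(action):
            graph.setdefault(loc, set()).add(obj)
    return graph


def graph_to_pddl(graph, goal, problem="explore", domain="household"):
    """Serialize the toy graph as a PDDL problem string (names are assumptions)."""
    objects = sorted({o for objs in graph.values() for o in objs})
    locations = sorted(graph)
    init = [f"(at {o} {loc})" for loc in locations for o in sorted(graph[loc])]
    lines = [
        f"(define (problem {problem})",
        f"  (:domain {domain})",
        "  (:objects " + " ".join(objects + locations) + ")",
        "  (:init " + " ".join(init) + ")",
        f"  (:goal {goal})",
        ")",
    ]
    return "\n".join(lines)
```

A caller would pass a real environment step function as `step_fn`, then hand the string returned by `graph_to_pddl` to a planner. Keeping exploration and serialization as separate functions mirrors the two phases in the quoted setup.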