Egocentric Planning for Scalable Embodied Task Achievement
Authors: Xiaotian Liu, Hector Palacios, Christian Muise
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated our approach in ALFRED, a simulated environment designed for domestic tasks, and demonstrated its high scalability, achieving an impressive 36.07% unseen success rate in the ALFRED benchmark and winning the ALFRED challenge at CVPR Embodied AI workshop. |
| Researcher Affiliation | Collaboration | Xiaotian Liu, ServiceNow Research, Montreal, QC, Canada, xiaotian.liu@mail.utoronto.ca; Hector Palacios, ServiceNow Research, Montreal, QC, Canada, hectorpal@gmail.com; Christian Muise, Queen's University, Kingston, ON, Canada, christian.muise@queensu.ca |
| Pseudocode | Yes | Algorithm 1 Iterative Exploration Replanning (IER) |
| Open Source Code | No | The paper mentions using a pre-trained model provided by FILM and converting its template-based result, but does not provide an explicit statement or link to their own open-source code for the methodology described. |
| Open Datasets | Yes | We evaluated our approach in ALFRED, a simulated environment designed for domestic tasks, and demonstrated its high scalability, achieving an impressive 36.07% unseen success rate in the ALFRED benchmark and winning the ALFRED challenge at CVPR Embodied AI workshop. |
| Dataset Splits | Yes | The ALFRED dataset contains a validation dataset which is split into 820 Validation Seen episodes and 821 Validation Unseen episodes. |
| Hardware Specification | Yes | The Perception and Language module was fine-tuned on an Nvidia 3080. |
| Software Dependencies | No | The paper mentions models like U-Net and Mask-RCNN and using pre-trained FILM models, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The random exploration step is used to generate a diverse set of object clusters for further exploration using our planner. Subsequently, at t = 500, the gathered information from the semantic spatial graph is converted into a PDDL problem for the agent. ... we allow our agent to first conduct 500 random exploration movements. |
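
The Experiment Setup row describes a two-phase schedule: 500 random exploration movements, then conversion of the semantic spatial graph into a PDDL problem at t = 500. The following is a minimal sketch of that schedule, not the authors' implementation. The action names, the object types, the `at` and `in` predicates, the domain name `household`, and both function names are hypothetical and chosen only for illustration; only the 500-step budget comes from the table above.

```python
import random

RANDOM_EXPLORATION_STEPS = 500  # from the setup row: planning starts at t = 500

# Hypothetical action names; the actual ALFRED action set is not reproduced here.
MOVES = ["MoveAhead", "RotateLeft", "RotateRight"]


def random_exploration(steps=RANDOM_EXPLORATION_STEPS, seed=0):
    """Return a reproducible sequence of random movement actions."""
    rng = random.Random(seed)
    return [rng.choice(MOVES) for _ in range(steps)]


def graph_to_pddl(objects, edges, goal, name="explore-task", domain="household"):
    """Serialize a toy spatial graph into a PDDL problem string.

    objects: dict mapping object name -> type name
    edges:   iterable of (predicate, arg1, arg2) facts
    goal:    a single (predicate, arg1, arg2) fact
    """
    obj_lines = "\n".join(f"    {o} - {t}" for o, t in sorted(objects.items()))
    init_lines = "\n".join(f"    ({p} {a} {b})" for p, a, b in sorted(edges))
    p, a, b = goal
    return (
        f"(define (problem {name})\n"
        f"  (:domain {domain})\n"
        f"  (:objects\n{obj_lines}\n  )\n"
        f"  (:init\n{init_lines}\n  )\n"
        f"  (:goal ({p} {a} {b}))\n"
        f")"
    )


# Phase 1: random exploration; phase 2: build the planning problem.
actions = random_exploration()
problem = graph_to_pddl(
    objects={"apple1": "item", "fridge1": "receptacle", "loc1": "location"},
    edges=[("at", "apple1", "loc1"), ("at", "fridge1", "loc1")],
    goal=("in", "apple1", "fridge1"),
)
print(len(actions))
print(problem)
```

In a full system, the resulting problem string would be passed to an off-the-shelf PDDL planner together with a matching domain file; that step is omitted here because the table does not specify which planner or domain definition was used.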