Layout-Aware Dreamer for Embodied Visual Referring Expression Grounding

Authors: Mingxiao Li, Zehao Wang, Tinne Tuytelaars, Marie-Francine Moens

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our agent achieves new state-of-the-art performance on the public leaderboard of the REVERIE dataset in challenging unseen test environments, improving navigation success (SR) by 4.02% and remote grounding success (RGS) by 3.43% over the previous state of the art.
Researcher Affiliation | Academia | (1) Computer Science Department, KU Leuven; (2) Electrical Engineering Department (ESAT-PSI), KU Leuven
Pseudocode | No | The paper contains no structured pseudocode or algorithm blocks, nor any clearly labeled algorithm sections or code-like formatted procedures.
Open Source Code | Yes | The code is released at https://github.com/zehao-wang/LAD
Open Datasets | Yes | Because the navigation task is characterized by realistic high-level instructions, the experiments evaluate the agent on the embodied goal-oriented benchmarks REVERIE (Qi et al. 2020) and SOON (Song et al. 2022).
Dataset Splits | Yes | The dataset is split into four sets: the training set, validation seen set, validation unseen set, and test set.
Hardware Specification | Yes | The whole training procedure takes two days on a single NVIDIA P100 GPU.
Software Dependencies | No | The paper mentions using GLIDE (Nichol et al. 2022) and CLIP (Radford et al. 2021) for data preprocessing and feature extraction, but does not provide version numbers for these or any other software dependencies, programming languages, or libraries used for the experiments.
Experiment Setup | Yes | The model is trained for 100k iterations with a batch size of 32 for single-action prediction, and for 50k iterations with a batch size of 8 for imitation learning with DAgger (Ross, Gordon, and Bagnell 2011). Both phases are optimized with the AdamW (Loshchilov and Hutter 2018) optimizer, with learning rates of 5e-5 and 1e-5, respectively.
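The experiment-setup row names AdamW as the optimizer for both training phases. As an illustration of what AdamW computes, here is a minimal pure-Python sketch of a single AdamW update on one scalar parameter; the learning rates (5e-5 for phase 1, 1e-5 for the DAgger phase) come from the paper, while all other hyperparameter values (betas, epsilon, weight decay) are conventional defaults, not values reported by the authors.

```python
import math

def adamw_step(p, g, m, v, t, lr=5e-5, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a single scalar parameter p with gradient g.

    m, v are the running first/second moment estimates; t is the 1-based
    step count. Returns the updated (p, m, v). Defaults other than lr are
    illustrative, not taken from the paper.
    """
    m = beta1 * m + (1 - beta1) * g        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * g * g    # second-moment (variance) estimate
    m_hat = m / (1 - beta1 ** t)           # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay: the decay term acts on p directly rather than
    # being folded into the gradient, which is what distinguishes AdamW
    # from Adam with L2 regularization.
    p = p - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * p)
    return p, m, v

# Phase 1 (single-action prediction) uses lr=5e-5; for phase 2
# (imitation learning with DAgger) one would pass lr=1e-5 instead.
p, m, v = 1.0, 0.0, 0.0
p, m, v = adamw_step(p, g=0.5, m=m, v=v, t=1, lr=5e-5)
```

With a positive gradient and positive parameter, one step nudges p down by roughly lr (the bias-corrected signal-to-noise ratio is close to 1 on the first step), which is the expected small-step behavior at these learning rates.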