Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule

Authors: Shuhei Kurita, Kyunghyun Cho

ICLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments, we show that the proposed generative approach outperforms the discriminative approach in the Room-2-Room (R2R) and Room-4-Room (R4R) datasets, especially in the unseen environments. We further show that the combination of the generative and discriminative policies achieves close to the state-of-the-art results in the R2R dataset, demonstrating that the generative and discriminative policies capture the different aspects of VLN.
Researcher Affiliation | Academia | Shuhei Kurita (AIP, RIKEN; PRESTO, JST); Kyunghyun Cho (Courant Institute, New York University; Center for Data Science, New York University; CIFAR Fellow)
Pseudocode | No | No pseudocode or algorithm blocks are explicitly present in the paper.
Open Source Code | Yes | The source code is available at https://github.com/shuheikurita/glgp.
Open Datasets | Yes | We conduct our experiments on the R2R navigation task (Anderson et al., 2018b), which is widely used for evaluating language-grounded navigation models, and R4R (Jain et al., 2019), which consists of longer and more complex paths when compared to R2R.
Dataset Splits | Yes | R2R contains four splits of data: train, validation-seen, validation-unseen and test-unseen. From the 90 scenes of Matterport 3D modelings (Chang et al., 2017), 61 scenes are pooled together and used as seen environments in both the training and validation-seen sets. Among the remaining scenes, 11 scenes form the validation-unseen set and 18 scenes the test-unseen set. ... The training set has 14,025 instructions, while the validation-seen and validation-unseen datasets have 1,020 and 2,349 instructions respectively. (These split statistics are summarized in a sketch after the table.)
Hardware Specification | Yes | We use a single NVIDIA V100 GPU for training.
Software Dependencies | No | The paper mentions using a neural network architecture and following existing codebases but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | We use the minibatch size of 25. We use the validation-unseen dataset to select hyperparameters. We use the mixture of supervised learning and imitation learning (Tan et al., 2019; Li et al., 2019) for both the generative and discriminative policies, which are referred to as teacher-forcing and student-forcing (Anderson et al., 2018b). In particular, during training, between the reference action a^T and a sampled action a^S, we select the next action by a = δ·a^S + (1 − δ)·a^T, where δ ∼ Bernoulli(η), following Li et al. (2019). We examine η ∈ {0, 1/5, 1/3, 1/2, 1} using the validation set and choose η = 1/3. ... β ∈ [0, 1] is a hyperparameter... In our experiment, we report the score of β = 0.5. (Sketches of this action-selection rule and the β mixture follow the table.)
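
For quick reference, the split statistics quoted in the Dataset Splits row can be collected into a small Python structure. This is only a summary of the numbers quoted above, not a data-loading utility from the released code; the test-unseen instruction count is not quoted above and is left unset.

```python
# Summary of the R2R splits as quoted from the paper (Anderson et al., 2018b).
# Scene counts partition the 90 Matterport3D scenes (Chang et al., 2017).
R2R_SPLITS = {
    "train":             {"scenes": 61, "instructions": 14025},
    "validation-seen":   {"scenes": 61, "instructions": 1020},  # same 61 seen scenes as train
    "validation-unseen": {"scenes": 11, "instructions": 2349},
    "test-unseen":       {"scenes": 18, "instructions": None},  # count not quoted above
}
```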
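The action-selection rule in the Experiment Setup row mixes teacher-forcing and student-forcing with a Bernoulli switch. Below is a minimal PyTorch-style sketch of that rule; the function and argument names are illustrative and not taken from the released glgp code.

```python
import torch

def select_next_action(action_logits, reference_action, eta=1/3):
    """Mix teacher-forcing and student-forcing at each training step:
    a = delta * a^S + (1 - delta) * a^T, with delta ~ Bernoulli(eta).

    action_logits: unnormalized policy scores over candidate actions (1-D tensor).
    reference_action: index of the ground-truth (teacher) action a^T.
    eta: probability of following the sampled (student) action a^S (paper: 1/3).
    """
    # Sample the student action a^S from the current policy.
    sampled_action = torch.distributions.Categorical(logits=action_logits).sample().item()
    # delta = 1 -> follow the sampled action; delta = 0 -> follow the reference.
    delta = torch.bernoulli(torch.tensor(eta)).item()
    return sampled_action if delta == 1.0 else reference_action
```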
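The β hyperparameter weights the generative policy, which scores actions by how well they explain the given instruction via Bayes' rule, against the discriminative policy. The exact combination rule is defined in the paper; the sketch below assumes a log-linear interpolation with a uniform action prior, and every name in it is illustrative.

```python
import torch
import torch.nn.functional as F

def combined_action_log_probs(disc_logits, gen_instruction_logprobs, beta=0.5):
    """Hedged sketch: interpolate the discriminative and generative policies.

    disc_logits: discriminative scores for p(a | instruction, state).
    gen_instruction_logprobs: log p(instruction | a, state) per candidate action
        from the generative (speaker-style) model. With a uniform prior over
        actions, Bayes' rule makes p(a | instruction, state) proportional to
        p(instruction | a, state), so renormalizing over actions gives the
        generative policy's posterior.
    beta: interpolation weight (the paper reports scores for beta = 0.5).
    """
    disc_logp = F.log_softmax(disc_logits, dim=-1)
    gen_logp = F.log_softmax(gen_instruction_logprobs, dim=-1)  # Bayes posterior, uniform prior
    return beta * gen_logp + (1 - beta) * disc_logp

# Example: pick the next action from the combined scores.
# next_action = combined_action_log_probs(d, g, beta=0.5).argmax().item()
```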