Tree-Structured Trajectory Encoding for Vision-and-Language Navigation

Authors: Xinzhe Zhou, Yadong Mu

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On the benchmark dataset R2R, our model achieves a surpassing success rate (SR) of 68% on val-unseen and 66% on test. We further conduct extensive ablation studies and analyses to provide more insights into the effectiveness of our designs.
Researcher Affiliation | Academia | Xinzhe Zhou (1), Yadong Mu* (1,2); (1) Wangxuan Institute of Computer Technology, Peking University; (2) Peng Cheng Laboratory; {zhouxinzhe1023, myd}@pku.edu.cn
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Descriptions of processes are given in paragraph form.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. There are no explicit statements about code release or repository links.
Open Datasets | Yes | Tab. 1 shows the results on the R2R (Anderson et al. 2018b) dataset. Besides R2R, we also tested our model on the more challenging RxR (Ku et al. 2020) dataset.
Dataset Splits | Yes | About 10% of the samples in the R2R val-unseen set involve at least one error-and-correction, even though the paths take only 4-6 steps. ... Tab. 1 shows the results on the R2R (Anderson et al. 2018b) dataset. As can be seen, our model achieves the best SR on the two unseen sets, especially test, which is used by the online leaderboard; this demonstrates the effectiveness of our designs. The SR on val-seen is slightly lower than VLN-HAMT (Chen et al. 2021)
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or detailed machine specifications) used for running its experiments. It discusses training models but does not mention hardware.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments. It mentions general components such as 'LSTM' or 'transformer' but not specific software versions.
Experiment Setup | Yes | For implementation, we use a first-in-first-out queue of length k to conduct the filtering online. Throughout our experiments, we set k = 5 empirically. Later, in the experiments, we show the detailed effect of varying k.
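The quoted setup describes online filtering with a fixed-length first-in-first-out queue (k = 5). A minimal Python sketch of such a queue is shown below; this is a hypothetical illustration of the data structure only, not the authors' code, and the `fifo_filter` name and streaming interface are assumptions.

```python
from collections import deque

def fifo_filter(stream, k=5):
    """Keep only the k most recent items from a stream, FIFO-style.

    A deque created with maxlen=k automatically evicts its oldest
    entry whenever a new item is appended to a full queue, which is
    exactly a first-in-first-out retention policy of length k.
    """
    queue = deque(maxlen=k)  # k = 5 in the paper's experiments
    for item in stream:
        queue.append(item)   # oldest item is dropped once len == k
    return list(queue)

# Streaming 8 items through a length-5 queue keeps only the last 5.
print(fifo_filter(range(8)))  # → [3, 4, 5, 6, 7]
```

Using `deque(maxlen=k)` keeps the online update O(1) per item, which matches the "conduct the filtering online" phrasing: each new observation is pushed and the stalest one is discarded without rescanning the queue.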