reproducibilityindex.ai

Frequency-Enhanced Data Augmentation for Vision-and-Language Navigation

Authors: Keji He, Chenyang Si, Zhihe Lu, Yan Huang, Liang Wang, Xinchao Wang

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Promising results on R2R, RxR, CVDN and REVERIE demonstrate that our FDA can be readily integrated with existing VLN approaches, improving performance without adding extra parameters, and keeping models simple and efficient. We simply investigate the sensitivity of benchmark methods to low and high-frequency information by perturbing the low-frequency or high-frequency components in images. Three powerful baseline models, i.e., HAMT [9], DUET [10], and TD-STP [64], are used to analyze the significance of low/high-frequency information on both R2R validation seen and unseen splits, wherein the navigation views are disrupted in the Fourier domain. As shown in Figure 2, the three models maintain a relatively high Success Rate (SR) under low-frequency perturbations.
Researcher Affiliation	Academia	(1) Center for Research on Intelligent Perception and Computing, National Key Laboratory for Multi-modal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences; (2) School of Artificial Intelligence, University of Chinese Academy of Sciences; (3) National University of Singapore; (4) Nanyang Technological University
Pseudocode	No	The paper contains diagrams illustrating the proposed approach (e.g., Figure 4), but these are visual representations and not structured pseudocode or algorithm blocks with numbered steps or code-like formatting.
Open Source Code	Yes	The code is available at https://github.com/hekj/FDA.
Open Datasets	Yes	The visual environments are based on the photo-realistic dataset Matterport3D (Mp3d) [6]. Four datasets containing the instruction-trajectory pairs have been adopted: R2R [5], RxR [29], CVDN [52] and REVERIE [45].
Dataset Splits	Yes	There are a total of 90 houses, with 61, 11, and 18 houses allocated for training/validation seen, validation unseen, and test splits, respectively.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions general concepts like 'models trained with high-frequency information'.
Software Dependencies	No	The paper does not provide specific software dependency details with version numbers (e.g., programming languages, libraries, or frameworks with their versions) that would be necessary for exact replication.
Experiment Setup	No	The paper describes the Frequency-enhanced Data Augmentation (FDA) method and its application, but it does not specify concrete experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or specific optimizer settings.
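
The Research Type row describes perturbing the low- or high-frequency components of navigation views in the Fourier domain. As a rough illustration of that idea, and not the authors' implementation (which is in the linked repository), the sketch below splits a grayscale image into low- and high-frequency parts with a circular mask and swaps in the high frequencies of a second image. The function names, the circular mask, and the `radius` cutoff are assumptions made here for illustration; none of them come from the paper.

```python
import numpy as np


def split_frequencies(image, radius):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    `radius` is a hypothetical cutoff (in frequency bins), not a value
    taken from the paper.
    """
    h, w = image.shape
    # Shift the zero-frequency (DC) component to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low_mask = dist <= radius  # True inside the low-frequency disc
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high


def perturb_high_frequency(image, interference, radius):
    """Keep the low frequencies of `image` and replace its high
    frequencies with those of `interference` (same shape)."""
    low, _ = split_frequencies(image, radius)
    _, high = split_frequencies(interference, radius)
    return low + high
```

Because the two masks are complementary, the low and high parts of an image sum back to the original image, so perturbing an image with itself leaves it unchanged. A low-frequency perturbation would follow the same pattern with the roles of `low` and `high` exchanged.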