Curriculum Learning for Vision-and-Language Navigation

Authors: Jiwen Zhang, Zhongyu Wei, Jianqing Fan, Jiajie Peng

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that our method is model-agnostic and can significantly improve the performance, the generalizability, and the training efficiency of current state-of-the-art navigation agents without increasing model complexity."
Researcher Affiliation | Academia | (1) School of Data Science, Fudan University, China; (2) Research Institute of Intelligent and Complex Systems, Fudan University, China; (3) Department of Operations Research and Financial Engineering, Princeton University, USA
Pseudocode | Yes | Algorithm 1: Self-paced Curriculum Learning
Open Source Code | No | The paper mentions supplementary materials and states that the authors reproduced the baseline agents in a unified code framework, but it neither explicitly states that their own source code is released nor links to a repository for the described methodology.
Open Datasets | Yes | "We develop the principle of curriculum design and re-arrange the benchmark Room-to-Room (R2R) dataset to make it suitable for curriculum training." Other related VLN datasets mentioned include the Touchdown dataset (Chen et al., 2019), ... and the CVDN dataset (Thomason et al., 2019)... as well as the RxR dataset (Ku et al., 2020), a multilingual VLN dataset built upon the Matterport3D simulator (Chang et al., 2017).
Dataset Splits | Yes | "We leave the validation and test sets of the R2R dataset unchanged so as to have comparable experimental results." Three training paradigms are compared: the traditional machine learning (ML) strategy, which trains by uniformly sampling mini-batches from the R2R dataset; a naive curriculum learning (NCL) strategy, in which the agent is first trained on the CLR2R round 1 split, then on the round 1~2 splits, and gradually on the whole CLR2R training set; and the previously introduced self-paced curriculum learning (SPCL) strategy.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU models, or memory specifications.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as programming language versions or library versions, that would be required for replication.
Experiment Setup | Yes | For the self-paced functions, we choose the binary and linear schemes. For the EnvDrop agent... we only use the ground-truth trajectory-based loss to update the weight variable. We set a_i = i for samples in the CLR2R round i split. The constant c is chosen within the range [0.95·a_1, a_1]. We enforce the initial weight for round 1 and 2 samples as one; w_0 then denotes the initial weight for samples in the round 3~5 splits. The number of training iterations is fixed at 80,000 for lines (3)~(5) in the table. We restrict the beam size to 5.
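The binary and linear self-paced schemes named in the Experiment Setup row follow the usual self-paced learning weight rule; a minimal sketch under that assumption (the function names and the pace threshold `lam` are illustrative, and the paper's Algorithm 1 is not reproduced here):

```python
# Sketch of binary vs. linear self-paced weighting (assumptions, not the
# paper's exact Algorithm 1). `lam` is the pace threshold that grows during
# training so that harder (higher-loss) samples gradually enter the pool.

def binary_weights(losses, lam):
    """Binary scheme: include a sample (weight 1) iff its loss is below lam."""
    return [1.0 if loss < lam else 0.0 for loss in losses]

def linear_weights(losses, lam):
    """Linear scheme: weight decays linearly with loss, clipped to [0, 1]."""
    return [max(0.0, 1.0 - loss / lam) for loss in losses]

losses = [0.2, 0.8, 1.5, 3.0]
print(binary_weights(losses, lam=1.0))  # -> [1.0, 1.0, 0.0, 0.0]
print(linear_weights(losses, lam=2.0))
```

Under either scheme, raising `lam` over training admits progressively harder samples, which is what makes the curriculum "self-paced."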
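The naive curriculum learning (NCL) schedule from the Dataset Splits row can be sketched as a training pool that starts with the CLR2R round-1 split and grows by one round per stage (the split contents below are placeholders, not actual CLR2R data):

```python
# Sketch of the NCL schedule: at stage s, train on the union of the CLR2R
# round 1..s splits; the final stage covers the whole training set.
# The sample identifiers are hypothetical placeholders.

def ncl_pool(splits_by_round, stage):
    """Return the training pool at curriculum stage `stage` (1-indexed)."""
    pool = []
    for r in range(1, stage + 1):
        pool.extend(splits_by_round[r])
    return pool

splits = {1: ["easy_1", "easy_2"], 2: ["medium_1"], 3: ["hard_1"]}
print(ncl_pool(splits, 1))  # -> ['easy_1', 'easy_2']
print(ncl_pool(splits, 3))  # -> ['easy_1', 'easy_2', 'medium_1', 'hard_1']
```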
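The weight-initialization details in the Experiment Setup row (a_i = i per CLR2R round, c in [0.95·a_1, a_1], rounds 1~2 initialized to weight one, rounds 3~5 to w_0) can be made concrete as follows; how c enters the self-paced objective is not specified in the excerpt and is deliberately left out:

```python
# Sketch of the reported initialization. With a_i = i, a_1 = 1, so the
# constant c lies in [0.95, 1.0]. The choice w0 = 0.5 below is an arbitrary
# illustrative value, not a number from the paper.

def initial_weight(round_idx, w0):
    """Initial self-paced weight for a sample from a given CLR2R round."""
    return 1.0 if round_idx <= 2 else w0

a = {i: float(i) for i in range(1, 6)}  # difficulty levels a_i = i, rounds 1..5
c_range = (0.95 * a[1], a[1])           # -> (0.95, 1.0)
print(c_range)
print([initial_weight(r, w0=0.5) for r in range(1, 6)])  # -> [1.0, 1.0, 0.5, 0.5, 0.5]
```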