Curriculum Learning for Vision-and-Language Navigation
Authors: Jiwen Zhang, Zhongyu Wei, Jianqing Fan, Jiajie Peng
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method is model-agnostic and can significantly improve the performance, the generalizability, and the training efficiency of current state-of-the-art navigation agents without increasing model complexity. |
| Researcher Affiliation | Academia | 1School of Data Science, Fudan University, China 2Research Institute of Intelligent and Complex Systems, Fudan University, China 3Department of Operations Research and Financial Engineering, Princeton University, USA |
| Pseudocode | Yes | Algorithm 1 Self-paced Curriculum Learning |
| Open Source Code | No | The paper mentions supplementary materials and that they reproduced agents in a unified code framework, but it does not provide an explicit statement about releasing their own source code or a link to a repository for the methodology described. |
| Open Datasets | Yes | We develop the principle of curriculum design and re-arrange the benchmark Room-to-Room (R2R) dataset to make it suitable for curriculum training. Other related VLN datasets include the Touchdown dataset (Chen et al., 2019), ... and the CVDN dataset (Thomason et al., 2019)... the RxR dataset (Ku et al., 2020), a multilingual VLN dataset built upon the Matterport3D simulator (Chang et al., 2017). |
| Dataset Splits | Yes | We leave the validation and test sets of the R2R dataset unchanged so as to have comparable experimental results. We experiment with three training paradigms: the traditional machine learning (ML) strategy, which trains by uniformly sampling mini-batches from the R2R dataset; a naive curriculum learning (NCL) strategy, in which the agent is first trained on the CLR2R round 1 split, then on the round 1~2 splits, and gradually on the whole CLR2R train set; and the previously introduced self-paced curriculum learning (SPCL) strategy. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU models, or memory specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as programming language versions or library versions, that would be required for replication. |
| Experiment Setup | Yes | For self-paced functions, we choose the binary and linear schemes. For the EnvDrop agent... we only use the ground-truth trajectory-based loss to update the weight variable. We set a_i = i for samples in the CLR2R round i split. The constant c is chosen within the range [0.95 a_1, a_1]. We enforce the initial weight for round 1 and 2 samples as one; w_0 then represents the initial weight for samples in the round 3~5 splits. The number of training iterations is fixed at 80,000 for lines (3)~(5) in the table. We restrict the beam size to 5. |
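The Experiment Setup row mentions binary and linear self-paced functions for weighting training samples. The paper's exact update rule (Algorithm 1) is not quoted here, so the sketch below uses the standard closed-form self-paced weights for those two schemes: under the binary scheme a sample gets weight 1 if its loss is below a threshold `lam` and 0 otherwise, while the linear scheme decays the weight linearly with loss. The function name and signature are illustrative assumptions, not the authors' code.

```python
import numpy as np

def spcl_weights(losses, lam, scheme="binary"):
    """Closed-form per-sample weights for self-paced learning (a sketch).

    binary: v_i = 1 if loss_i < lam else 0
    linear: v_i = max(0, 1 - loss_i / lam)

    `lam` is the self-paced "age" threshold: raising it over training
    admits harder (higher-loss) samples into the effective train set.
    """
    losses = np.asarray(losses, dtype=float)
    if scheme == "binary":
        # Hard selection: only easy samples (low loss) participate.
        return (losses < lam).astype(float)
    elif scheme == "linear":
        # Soft selection: weight decays linearly, clipped to [0, 1].
        return np.clip(1.0 - losses / lam, 0.0, 1.0)
    raise ValueError(f"unknown self-paced scheme: {scheme!r}")
```

The curriculum side of SPCL would additionally constrain these weights with the round-based difficulty levels a_i quoted above (e.g., fixing round 1~2 weights to one), which is omitted here for brevity.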