WebVLN: Vision-and-Language Navigation on Websites
Authors: Qi Chen, Dileepa Pitawela, Chongyang Zhao, Gengze Zhou, Hsiang-Ting Chen, Qi Wu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that WebVLN-Net outperforms current VLN and web-related navigation methods. |
| Researcher Affiliation | Academia | Australian Institute for Machine Learning, The University of Adelaide {qi.chen04, dileepa.pitawela, chongyang.zhao, gengze.zhou, tim.chen, qi.wu01}@adelaide.edu.au |
| Pseudocode | No | The paper describes the architecture and processes involved but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at: https://github.com/WebVLN/WebVLN. |
| Open Datasets | No | Due to the lack of an off-the-shelf dataset for WebVLN, we have collected a new WebVLN-v1 dataset to facilitate research in this field. It comprises 8,990 records/paths with 14,825 QA pairs derived from three different shopping websites (aliased as SA, HB and ES). |
| Dataset Splits | Yes | We split 60% samples as training data, 10% samples as validation data and 30% samples as testing data (i.e., 8,960/1,262/4,603). (A hedged split sketch follows the table.) |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU or CPU models used for the experiments. |
| Software Dependencies | No | The paper mentions various models and tools like 'VLNBERT', 'BLIP-2', and 'ChatGPT with the gpt-3.5-turbo model', but does not provide specific version numbers for software dependencies or programming languages/libraries. |
| Experiment Setup | Yes | In all the experiments, we set the weighting hyperparameters η and λ equal to 1. (A hedged loss-weighting sketch follows the table.) |
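
For context on the split row: the three reported counts (8,960/1,262/4,603) sum to the 14,825 QA pairs rather than the 8,990 records/paths, which suggests the 60/10/30 split is applied at the path level, with QA-pair counts per split following from however many questions each path carries. Below is a minimal sketch of that reading; the `split_paths` helper and the random seed are assumptions for illustration, not taken from the paper.

```python
import random

def split_paths(paths, seed=0):
    """Split navigation paths 60/10/30 into train/val/test.

    The 60/10/30 ratio is from the paper; splitting at the path level
    (rather than the QA-pair level) is an assumption that would explain
    why the reported QA-pair counts (8,960/1,262/4,603) deviate slightly
    from exact 60/10/30 proportions of the 14,825 QA pairs.
    """
    rng = random.Random(seed)  # seed is an assumption, not stated in the paper
    shuffled = list(paths)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train, n_val = int(0.6 * n), int(0.1 * n)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

train, val, test = split_paths(range(8990))  # 8,990 records/paths in WebVLN-v1
print(len(train), len(val), len(test))       # 5394 899 2697
```

Under this assumption the path-level counts come out to 5,394/899/2,697, while the QA-pair counts per split vary with the number of questions attached to each path.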
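
For the experiment-setup row: the report gives only the values of the weighting hyperparameters η and λ, not the objective they scale. A minimal sketch, assuming η and λ weight a navigation loss and a question-answering loss respectively; the function name `webvln_objective` and this particular loss decomposition are assumptions, not the paper's stated formulation.

```python
import torch

def webvln_objective(loss_nav: torch.Tensor,
                     loss_qa: torch.Tensor,
                     eta: float = 1.0,
                     lam: float = 1.0) -> torch.Tensor:
    """Weighted training objective with eta = lam = 1, as reported.

    Assumption: eta and lam weight a navigation loss and a
    question-answering loss; the report only states the weight
    values, not which terms they scale.
    """
    return eta * loss_nav + lam * loss_qa

# With both weights at 1, the objective reduces to a plain sum of terms.
loss = webvln_objective(torch.tensor(0.7), torch.tensor(0.3))
print(loss)  # tensor(1.)
```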