Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
WebVLN: Vision-and-Language Navigation on Websites
Authors: Qi Chen, Dileepa Pitawela, Chongyang Zhao, Gengze Zhou, Hsiang-Ting Chen, Qi Wu
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that WebVLNNet outperforms current VLN and web-related navigation methods. |
| Researcher Affiliation | Academia | Australian Institute for Machine Learning, The University of Adelaide |
| Pseudocode | No | The paper describes the architecture and processes involved but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at: https://github.com/WebVLN/WebVLN. |
| Open Datasets | No | Due to the lack of an off-the-shelf dataset for WebVLN, we have collected a new WebVLN-v1 dataset to facilitate research in this field. It comprises 8,990 records/paths with 14,825 QA pairs derived from three different shopping websites (aliased as SA, HB and ES). |
| Dataset Splits | Yes | We split 60% samples as training data, 10% samples as validation data and 30% samples as testing data (i.e., 8,960/1,262/4,603). |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU or CPU models used for the experiments. |
| Software Dependencies | No | The paper mentions various models and tools like 'VLNBERT', 'BLIP-2', and 'ChatGPT with the gpt-3.5-turbo model', but does not provide specific version numbers for software dependencies or programming languages/libraries. |
| Experiment Setup | Yes | In all the experiments, we set the weighting hyperparameters η and λ equal to 1. |
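
The split sizes quoted in the Dataset Splits row can be sanity-checked against the 14,825 QA pairs quoted in the Open Datasets row. The sketch below is illustrative only and is not taken from the paper or its repository: the names `SPLIT_COUNTS`, `TOTAL_QA_PAIRS`, and `split_fractions` are hypothetical, and it assumes the three counts refer to QA pairs rather than to records/paths, which the table does not state explicitly.

```python
# Split sizes quoted in the "Dataset Splits" row (train / validation / test).
SPLIT_COUNTS = {"train": 8960, "validation": 1262, "test": 4603}

# Total number of QA pairs quoted in the "Open Datasets" row.
# Assumption: the split counts are measured in QA pairs, not records/paths.
TOTAL_QA_PAIRS = 14825


def split_fractions(counts):
    """Return each split's share of the summed counts as a fraction in [0, 1]."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    # The three splits should account for every QA pair exactly once.
    assert sum(SPLIT_COUNTS.values()) == TOTAL_QA_PAIRS
    for name, fraction in split_fractions(SPLIT_COUNTS).items():
        print(f"{name}: {fraction:.1%}")
```

Under that assumption the counts sum to exactly 14,825, and the realized shares are roughly 60.4%, 8.5%, and 31.0%, so the stated 60%/10%/30% split is approximate rather than exact.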