SIRI: Spatial Relation Induced Network For Spatial Description Resolution

Authors: Peiyao Wang, Weixin Luo, Yanyu Xu, Haojie Li, Shugong Xu, Jianyu Yang, Shenghua Gao

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the Touchdown show that our method is around 24% better than the state-of-the-art method in terms of accuracy, measured by an 80-pixel radius.
Researcher Affiliation | Academia | Peiyao Wang, Weixin Luo, Yanyu Xu, ShanghaiTech University, {wangpy, luowx, xuyy2}@shanghaitech.edu.cn; Haojie Li, Dalian University of Technology, hjli@dlut.edu.cn; Shugong Xu, Shanghai University, shugong@shu.edu.cn; Jianyu Yang, Soochow University, jyyang@suda.edu.cn; Shenghua Gao, gaoshh@shanghaitech.edu.cn
Pseudocode | No | The paper describes the system architecture and components but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code for this project is publicly available at https://github.com/wong-puiyiu/siri-sdr.
Open Datasets | Yes | We conducted all experiments on the Touchdown dataset (3), which is designed for navigation and spatial description reasoning in a real-life environment.
Dataset Splits | Yes | In total, this dataset contains 27,575 samples for SDR, including 17,878 training samples, 3,836 validation samples and 3,859 testing samples.
Hardware Specification | Yes | All the experiments are conducted with a GeForce GTX TITAN X.
Software Dependencies | No | The code is implemented in PyTorch.
Experiment Setup | Yes | In addition, the number of training mini-batches and the learning rate are 10 and 0.0001 respectively.
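
The Research Type entry reports accuracy "measured by an 80-pixel radius". As a rough illustration of how such a metric is commonly computed, the sketch below counts a prediction as correct when its Euclidean distance to the target point is at most the radius. The function name `accuracy_within_radius`, the inclusive `<=` comparison, and the sample coordinates are assumptions made for illustration; they are not taken from the paper or its released code.

```python
import math


def accuracy_within_radius(predictions, targets, radius=80.0):
    """Fraction of predicted (x, y) points within `radius` pixels of their target.

    Assumption: a prediction exactly `radius` pixels away counts as a hit.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        return 0.0
    hits = sum(
        1
        for (px, py), (tx, ty) in zip(predictions, targets)
        if math.hypot(px - tx, py - ty) <= radius
    )
    return hits / len(predictions)


# Hypothetical pixel coordinates, for illustration only.
preds = [(100, 100), (400, 250), (10, 10)]
gold = [(130, 140), (400, 400), (10, 95)]
print(accuracy_within_radius(preds, gold))
```

In this toy input the three distances are 50, 150, and 85 pixels, so only the first prediction falls inside the 80-pixel radius.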