Visual Semantic Navigation using Scene Priors

Authors: Wei Yang, Xiaolong Wang, Ali Farhadi, Abhinav Gupta, Roozbeh Mottaghi

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments show how semantic knowledge improves performance significantly. More importantly, we show improvement in generalization to unseen scenes and/or objects."
Researcher Affiliation | Collaboration | Wei Yang (1), Xiaolong Wang (2), Ali Farhadi (4, 5), Abhinav Gupta (2, 3), Roozbeh Mottaghi (5); affiliations: 1 The Chinese University of Hong Kong, 2 Carnegie Mellon University, 3 Facebook AI Research, 4 University of Washington, 5 Allen Institute for AI
Pseudocode | No | The paper provides architectural diagrams and textual descriptions of the model and its GCN integration, but it includes no structured pseudocode or algorithm blocks. (A generic graph-convolution layer is sketched after the table.)
Open Source Code | No | The paper neither states that the source code will be released nor links to a code repository.
Open Datasets | Yes | "To evaluate our model, we use the AI2-THOR framework (Kolve et al., 2017), which provides near photo-realistic customizable environments. ... We use the Visual Genome (Krishna et al., 2017) dataset as a source to build the knowledge graph." (A hedged graph-construction sketch follows the table.)
Dataset Splits | Yes | "We randomly split the scenes into three splits for each room category, i.e., 20 training rooms, 5 validation rooms, and 5 testing rooms." (A split sketch follows the table.)
Hardware Specification | Yes | "Our method is implemented in Tensorflow (Abadi et al., 2015) and the actor-critic policy network is trained with a single NVIDIA GeForce GTX Titan X GPU with 20 threads for 10 million frames for experiments without stop action, and for 25 million frames for experiments with stop action."
Software Dependencies | No | The paper states, "Our method is implemented in Tensorflow (Abadi et al., 2015)", and "The network parameters are optimized by the RMSProp optimizer (Tieleman & Hinton)". The framework and optimizer are named, but no version numbers for TensorFlow or any other dependency are given.
Experiment Setup | Yes | "The initial learning rate is set empirically as 7e-4, and is decreased linearly as the training progresses. The maximum number of steps is set to 100 for kitchen, bedroom and bathroom, and to 200 for living room due to the larger exploration space." (A schedule sketch follows the table.)
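
Since the paper offers no pseudocode, a minimal sketch of the graph convolution underlying its GCN component may help. This follows the standard Kipf & Welling (2017) formulation that the paper builds on, not the paper's exact implementation; the normalization choice, shapes, and names here are assumptions.

```python
import numpy as np

# Minimal sketch of one graph convolution layer (Kipf & Welling style).
# Normalization choice and shapes are assumptions, not taken from the paper.
def gcn_layer(A, H, W):
    """A: (n, n) adjacency with self-loops already added,
    H: (n, d_in) node features, W: (d_in, d_out) learned weights."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    A_hat = A * np.outer(d_inv_sqrt, d_inv_sqrt)  # symmetric D^-1/2 A D^-1/2
    return np.maximum(A_hat @ H @ W, 0.0)         # ReLU activation

# Tiny usage example on a 3-node graph with 4-d features.
A = np.eye(3) + np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
H = np.random.randn(3, 4)
W = np.random.randn(4, 2)
print(gcn_layer(A, H, W).shape)  # (3, 2)
```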
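The paper builds its knowledge graph from Visual Genome object relationships. Below is a hedged sketch of one plausible construction that counts subject-object relationship triples and keeps frequent pairs as edges; the triple format, the `min_count` threshold, and the helper name are illustrative assumptions, not the paper's exact recipe.

```python
from collections import Counter

# Hedged sketch: derive knowledge-graph edges from Visual Genome style
# (subject, predicate, object) triples. Threshold and format are assumptions.
def build_kg_edges(triples, categories, min_count=2):
    counts = Counter(
        frozenset((s, o)) for s, _, o in triples
        if s in categories and o in categories and s != o
    )
    return {tuple(pair) for pair, c in counts.items() if c >= min_count}

edges = build_kg_edges(
    [("cup", "on", "table"), ("cup", "near", "table"), ("lamp", "on", "desk")],
    categories={"cup", "table", "lamp", "desk"},
)
print(edges)  # {("cup", "table")} (order within the pair is arbitrary)
```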
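The 20/5/5 per-category split is simple enough to reproduce with a seeded shuffle. The sketch below assumes AI2-THOR's conventional scene naming (FloorPlan1 through FloorPlan30 for kitchens); the fixed seed and helper name are assumptions, since the paper does not publish its exact split.

```python
import random

# Sketch of the paper's per-room-category split: 30 scenes per category,
# randomly partitioned into 20 train / 5 validation / 5 test rooms.
def split_scenes(scene_ids, seed=0):
    rng = random.Random(seed)          # fixed seed is an assumption
    scenes = list(scene_ids)
    rng.shuffle(scenes)
    return {"train": scenes[:20], "val": scenes[20:25], "test": scenes[25:30]}

# AI2-THOR kitchens are conventionally FloorPlan1..FloorPlan30; other room
# categories use other index ranges.
kitchen_split = split_scenes([f"FloorPlan{i}" for i in range(1, 31)])
print({k: len(v) for k, v in kitchen_split.items()})  # {'train': 20, 'val': 5, 'test': 5}
```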
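The reported optimization details (RMSProp, initial learning rate 7e-4, linear decay over training) translate directly into a short schedule. The TensorFlow 2 optimizer call below is a sketch under the assumption of the modern tf.keras API; the paper predates TF2 and gives no further hyperparameters, so the defaults here are assumptions.

```python
import tensorflow as tf

TOTAL_FRAMES = 10_000_000  # 10M frames for the no-stop-action runs (per the paper)

def linear_lr(step, base_lr=7e-4, total=TOTAL_FRAMES):
    # Linear decay from the paper's initial 7e-4 down to 0 over training.
    return base_lr * max(0.0, 1.0 - step / total)

# The paper names RMSProp; the remaining hyperparameters are TF2 defaults,
# which is an assumption.
optimizer = tf.keras.optimizers.RMSprop(learning_rate=linear_lr(0))

# Inside the training loop one would refresh the rate each update, e.g.:
# optimizer.learning_rate.assign(linear_lr(global_step))
```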