Visual Semantic Navigation using Scene Priors
Authors: Wei Yang, Xiaolong Wang, Ali Farhadi, Abhinav Gupta, Roozbeh Mottaghi
ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show how semantic knowledge improves performance significantly. More importantly, we show improvement in generalization to unseen scenes and/or objects. |
| Researcher Affiliation | Collaboration | Wei Yang [1], Xiaolong Wang [2], Ali Farhadi [4,5], Abhinav Gupta [2,3], Roozbeh Mottaghi [5]. Affiliations: 1 The Chinese University of Hong Kong; 2 Carnegie Mellon University; 3 Facebook AI Research; 4 University of Washington; 5 Allen Institute for AI |
| Pseudocode | No | The paper provides architectural diagrams and textual descriptions of the model and GCN integration, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for their methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | To evaluate our model, we use the AI2-THOR framework (Kolve et al., 2017), which provides near photo-realistic customizable environments. ... We use the Visual Genome (Krishna et al., 2017) dataset as a source to build the knowledge graph. |
| Dataset Splits | Yes | We randomly split the scenes into three splits for each room category, i.e., 20 training rooms, 5 validation rooms, and 5 testing rooms. (See the split sketch after this table.) |
| Hardware Specification | Yes | Our method is implemented in Tensorflow (Abadi et al., 2015) and the actor-critic policy network is trained with a single NVIDIA GeForce GTX Titan X GPU with 20 threads for 10 million frames for experiments without stop action, and for 25 million frames for experiments with stop action. |
| Software Dependencies | No | The paper states, 'Our method is implemented in Tensorflow (Abadi et al., 2015)', and 'The network parameters are optimized by the RMSProp optimizer (Tieleman & Hinton)'. While the framework and optimizer are named, specific version numbers for TensorFlow or any other software dependencies are not provided. |
| Experiment Setup | Yes | The initial learning rate is set empirically as 7e-4, and is decreased linearly as the training progresses. The maximum number of steps is set to 100 for kitchen, bedroom and bathroom, and to 200 for living room due to the larger exploration space. (See the schedule sketch after this table.) |
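
The 20/5/5 per-category scene split quoted in the Dataset Splits row is straightforward to reproduce in spirit. Below is a minimal sketch of such a random split; the scene identifiers, the `split_scenes` helper, and the fixed seed are illustrative assumptions and do not come from the paper.

```python
import random

# Room categories used in the paper's AI2-THOR experiments.
ROOM_CATEGORIES = ["kitchen", "living_room", "bedroom", "bathroom"]

def split_scenes(scene_ids, n_train=20, n_val=5, n_test=5, seed=0):
    """Randomly split one category's scenes into train/val/test (20/5/5)."""
    assert len(scene_ids) == n_train + n_val + n_test
    rng = random.Random(seed)  # fixed seed is an assumption, not from the paper
    ids = list(scene_ids)
    rng.shuffle(ids)
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }

# Hypothetical scene names; 30 per category matches the 20/5/5 split quoted above.
splits = {
    cat: split_scenes([f"{cat}_{i:02d}" for i in range(1, 31)])
    for cat in ROOM_CATEGORIES
}
print({cat: {k: len(v) for k, v in s.items()} for cat, s in splits.items()})
```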
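The Experiment Setup and Hardware rows together fix the reported training hyperparameters: an initial learning rate of 7e-4 decayed linearly, episode limits of 100 or 200 steps, and frame budgets of 10M or 25M depending on the stop action. The sketch below expresses that schedule in plain Python; the decay endpoint of 0.0 is an assumption, since the paper only says the rate "is decreased linearly as the training progresses."

```python
def linear_lr(frame, total_frames, initial_lr=7e-4, final_lr=0.0):
    """Linearly anneal the learning rate over the training frame budget.
    final_lr=0.0 is an assumption; the paper does not state the endpoint."""
    frac = min(frame / total_frames, 1.0)
    return initial_lr + (final_lr - initial_lr) * frac

# Values quoted from the paper's experiment setup.
TOTAL_FRAMES_NO_STOP = 10_000_000    # experiments without the stop action
TOTAL_FRAMES_WITH_STOP = 25_000_000  # experiments with the stop action
MAX_EPISODE_STEPS = {
    "kitchen": 100, "bedroom": 100, "bathroom": 100, "living_room": 200,
}

print(linear_lr(5_000_000, TOTAL_FRAMES_NO_STOP))  # 3.5e-4 at the halfway point
```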