Source-Target Inference Models for Spatial Instruction Understanding
Authors: Hao Tan, Mohit Bansal
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our models achieve substantial improvements over previous work: 47% on source block selection accuracy and 22% on target position mean distance. Section headings include "4 Experimental Setup", "5 Results and Analysis", and "5.1 Ablation Results". |
| Researcher Affiliation | Academia | Hao Tan, Mohit Bansal Department of Computer Science University of North Carolina at Chapel Hill {haotan, mbansal}@cs.unc.edu |
| Pseudocode | No | The paper describes methods in text and equations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the described methodology. |
| Open Datasets | Yes | We employ the challenging blank-labeled dataset introduced in Bisk, Yuret, and Marcu (2016) |
| Dataset Splits | Yes | We use the standard training/dev/test splits from Bisk, Yuret, and Marcu (2016), and use the dev set for all hyperparameter tuning. |
| Hardware Specification | No | The paper mentions 'Nvidia GPU awards' in the acknowledgments but does not provide specific details such as GPU models, CPU models, or other hardware specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions software components like 'LSTM-RNN' and 'Adam optimizer' but does not provide specific version numbers for any software, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | The sentence encoder is an LSTM-RNN with 256-dimensional hidden vectors and word embeddings. The block embedding layer is a 64-dimensional fully-connected layer. We use a generalized (adapted to our approach) Xavier initialization (Glorot and Bengio 2010) to keep the variance (energy) of each feature map constant across layers, which stabilizes the training process. The Adam optimizer (Kingma and Ba 2014) is used to update the parameters, and the learning rate is fixed at 0.001. Gradient clipping (Pascanu, Mikolov, and Bengio 2013) is applied to the LSTM parameters to avoid exploding gradients. For our annealment-based sampling approach (Sec. 3.2), we start from the expectation loss, then sample N = 20 (which approximately matches the expectation loss), and then anneal it down to 1 (which is the same as the one-block sampling loss). To speed up the training process, the initial annealing decay step is 5, which is then reduced to 2, and finally to 1. The final sequence of block samples N is {20, 15, 10, 8, 6, 5, 4, 3, 2, 1}. Regularization: To regularize the network, we use weight decay for all trainable variables, and a dropout layer of 0.2 probability is added before and after the LSTM layer. |
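
The setup quoted above specifies concrete hyperparameters (256-d LSTM encoder, 64-d block embedding, Adam at lr 0.001, dropout 0.2 around the LSTM, gradient clipping, and an annealed sample-count schedule). The sketch below restates them as a minimal PyTorch configuration; this is an illustration under stated assumptions, not the authors' code. PyTorch itself, the class and function names (`SentenceEncoder`, `training_step`), `VOCAB_SIZE`, `NUM_BLOCK_FEATURES`, the weight-decay magnitude, and the clipping norm are all assumptions not given in the paper.

```python
# Minimal sketch of the reported training setup (hedged; not the authors' implementation).
import torch
import torch.nn as nn

VOCAB_SIZE = 10_000          # placeholder: vocabulary size is not reported
HIDDEN_DIM = 256             # LSTM hidden size and word-embedding size (from the paper)
BLOCK_EMB_DIM = 64           # block embedding layer size (from the paper)
NUM_BLOCK_FEATURES = 3       # placeholder: raw block-feature dimensionality is not reported


class SentenceEncoder(nn.Module):
    """LSTM-RNN sentence encoder with dropout before and after the LSTM (p = 0.2)."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN_DIM)
        self.pre_drop = nn.Dropout(0.2)
        self.lstm = nn.LSTM(HIDDEN_DIM, HIDDEN_DIM, batch_first=True)
        self.post_drop = nn.Dropout(0.2)

    def forward(self, token_ids):
        x = self.pre_drop(self.embed(token_ids))
        out, _ = self.lstm(x)
        return self.post_drop(out[:, -1, :])   # last hidden state as the sentence vector


# 64-dimensional fully-connected block embedding layer.
block_embedding = nn.Linear(NUM_BLOCK_FEATURES, BLOCK_EMB_DIM)

encoder = SentenceEncoder()
params = list(encoder.parameters()) + list(block_embedding.parameters())

# Adam with the fixed learning rate 0.001; weight decay on all trainable variables
# (the decay magnitude below is an assumed value, not from the paper).
optimizer = torch.optim.Adam(params, lr=1e-3, weight_decay=1e-5)

# Annealed sample-count schedule from the paper: start near the expectation loss
# (N = 20) and anneal down to single-block sampling (N = 1).
SAMPLE_SCHEDULE = [20, 15, 10, 8, 6, 5, 4, 3, 2, 1]


def training_step(loss):
    """One parameter update with gradient clipping on the LSTM parameters."""
    optimizer.zero_grad()
    loss.backward()
    # Clip only the LSTM gradients to avoid exploding gradients (norm value assumed).
    torch.nn.utils.clip_grad_norm_(encoder.lstm.parameters(), max_norm=5.0)
    optimizer.step()
```

In practice one would iterate over `SAMPLE_SCHEDULE`, training with the corresponding number of sampled blocks per epoch before moving to the next value, which mirrors the annealment from the expectation loss toward the one-block sampling loss described in the quoted setup.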