Learning Out-of-Vocabulary Words in Intelligent Personal Agents

Authors: Avik Ray, Yilin Shen, Hongxia Jin

Venue: IJCAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments on both benchmark and custom datasets show our new parsers achieve significant accuracy gain on OOV words and phrases, and in the meanwhile learn OOV words while maintaining accuracy on previously supported instructions." |
| Researcher Affiliation | Industry | "Avik Ray, Yilin Shen and Hongxia Jin, Samsung Research America, Mountain View, California, USA ({avik.r, yilin.shen, hongxia.jin}@samsung.com)" |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | "We consider three benchmark semantic parsing datasets as our base datasets. The first is the geographical queries dataset (GEO), with 880 queries. The second is the job queries dataset (JOB), with 640 queries. The third, the airline queries dataset (ATIS), contains 5,410 queries overall (4,480 training, 480 validation, 450 test)." |
| Dataset Splits | Yes | "The third airline queries dataset (ATIS) contains 5,410 queries overall (4,480 training, 480 validation, 450 test)." (See the split check after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details, such as GPU models, CPU models, or cloud computing instance types, used for running the experiments. |
| Software Dependencies | No | The paper mentions using the "Torch 7 tool" but does not specify version numbers for other relevant software dependencies, such as the programming language or specific libraries. |
| Experiment Setup | Yes | "We use pre-trained GloVe embeddings [Pennington et al., 2014] for all our models. We choose the LSTM hidden state dimension d ∈ {100, 200, 300} and dropout rate in {0.5, 0.4, 0.3, 0.2}. In the argument transfer model we take γ ∈ [4, 6]. For the paraphrase generation model we choose threshold τ ∈ [1, 0.9]. RMSProp was used as the optimization algorithm." (See the hyperparameter sketch after the table.) |
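
To make the reported dataset counts easy to sanity-check, here is a minimal sketch that encodes the statistics quoted above and verifies that the stated ATIS split sums to the stated total. The names `DATASETS` and `check_splits` are hypothetical; only the numbers come from the paper.

```python
# Hypothetical encoding of the dataset statistics quoted above;
# only the numbers are taken from the paper.
DATASETS = {
    "GEO":  {"total": 880},
    "JOB":  {"total": 640},
    "ATIS": {"total": 5410, "train": 4480, "valid": 480, "test": 450},
}

def check_splits(stats):
    """Assert that any reported train/valid/test split sums to the reported total."""
    for name, s in stats.items():
        parts = [s[k] for k in ("train", "valid", "test") if k in s]
        if parts:
            assert sum(parts) == s["total"], f"{name}: splits do not sum to total"

check_splits(DATASETS)  # passes: 4,480 + 480 + 450 = 5,410 for ATIS
```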
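
The Experiment Setup row names GloVe embeddings, an LSTM hidden size searched over {100, 200, 300}, dropout rates in {0.5, 0.4, 0.3, 0.2}, and RMSProp, but the authors used Torch 7 and released no code. The sketch below renders that configuration in PyTorch as one plausible reading; the embedding dimension (300), the learning rate, and all class and function names are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Hyperparameter grid as reported in the paper.
HIDDEN_DIMS = [100, 200, 300]         # LSTM hidden state dimension d
DROPOUT_RATES = [0.5, 0.4, 0.3, 0.2]
GAMMA_RANGE = (4, 6)                  # argument transfer model parameter γ
TAU_RANGE = (1.0, 0.9)                # paraphrase generation threshold τ
EMBED_DIM = 300                       # assumption: 300-d GloVe vectors

class ParserEncoder(nn.Module):
    """Minimal LSTM encoder over pre-trained GloVe embeddings (illustrative only)."""
    def __init__(self, glove_weights, hidden_dim, dropout):
        super().__init__()
        # Initialize the embedding layer from a pre-trained GloVe matrix.
        self.embed = nn.Embedding.from_pretrained(glove_weights, freeze=False)
        self.dropout = nn.Dropout(dropout)
        self.lstm = nn.LSTM(glove_weights.size(1), hidden_dim, batch_first=True)

    def forward(self, token_ids):
        x = self.dropout(self.embed(token_ids))   # (batch, seq, EMBED_DIM)
        outputs, (h, _) = self.lstm(x)            # (batch, seq, hidden_dim)
        return outputs, h

# RMSProp optimizer, as stated in the paper; the learning rate is an assumption.
def make_optimizer(model, lr=1e-3):
    return torch.optim.RMSprop(model.parameters(), lr=lr)

# Illustrative usage with random stand-in weights (a real run would load GloVe).
glove = torch.randn(10_000, EMBED_DIM)
model = ParserEncoder(glove, hidden_dim=HIDDEN_DIMS[1], dropout=DROPOUT_RATES[0])
optimizer = make_optimizer(model)
```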