Context Aware Conversational Understanding for Intelligent Agents With a Screen
Authors: Vishal Naik, Angeliki Metallinou, Rahul Goel
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that this approach outperforms a rule-based alternative, and can be extended in a straightforward manner to new contextual use cases. We perform detailed evaluation of contextual and non-contextual use cases and show that our system displays accurate contextual behavior without degrading the performance of noncontextual user requests. |
| Researcher Affiliation | Collaboration | Vishal Ishwar Naik¹ ... ¹Arizona State University, ²Amazon Alexa Machine Learning; vnaik1@asu.edu, {ametalli, goerahul}@amazon.com |
| Pseudocode | No | The paper describes models and architectures with mathematical notations and figures but does not include explicit pseudocode blocks or algorithm listings. |
| Open Source Code | No | The paper mentions 'word2vec pre-trained embeddings of size E=300 downloaded from (Mikolov )' with a URL to a Google Code Archive for word2vec, but this is a third-party resource they used, not their own source code for the methodology described in the paper. There is no explicit statement or link for their own code. |
| Open Datasets | No | We use a large set of non contextual utterances from user interactions with Alexa on devices without a screen, and a smaller dataset of contextual utterances from user interactions where a screen is available. ... The dataset used for these experiments is a fraction of our production data and covers a range of domain functionality including music, books, movies and showtimes, videos, calendar events, local search, shopping, and general commands. |
| Dataset Splits | Yes | We used a train-dev split of 70%-30%, where the dev set was used for optimizing the deep learning models as well as tuning the reranker parameters α, β. (A minimal split sketch follows the table.) |
| Hardware Specification | No | The paper mentions 'availability of fast GPU computing resources' in the related work section but does not specify the particular hardware (e.g., GPU model, CPU type) used for their experiments. |
| Software Dependencies | No | The paper mentions deep learning models (LSTMs, CNNs) and word2vec embeddings but does not provide specific software dependencies with version numbers (e.g., Python, TensorFlow/PyTorch versions, specific library versions). |
| Experiment Setup | Yes | All our deep learning models were trained end-to-end using stochastic gradient descent. Models were regularized using dropout and L2. We used word2vec pre-trained embeddings of size E=300 downloaded from (Mikolov). For the context-BiLSTM(keys, values) model we use a hidden layer of size H between the context value feature vector and the BiLSTM, see Fig. 1b. Empirically, we chose H = 200 while the input value vector is of size N × E = 5 × 300. Value inputs at each of the 5 positions are shuffled between training epochs, as described in Section 5.2, which led to better performance. For the context-CNN+BiLSTM(keys, values), we used F = 100 filters for the convolutional layer. We also found that adding a ReLU non-linearity after the max pooling operation performs slightly better. For the non-context BiLSTM + contextual Reranker, the reranker parameters α and β were chosen based on grid search in [0, 1], to optimize the dev set performance for slots and intents. (Minimal sketches of this setup and of the reranker grid search follow the table.) |
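The 70%-30% train-dev split quoted in the Dataset Splits row is straightforward to mirror; below is a minimal sketch assuming scikit-learn, with toy placeholder utterances and labels standing in for the proprietary Alexa data, which is not publicly available.

```python
# Minimal sketch of a 70%-30% train-dev split (scikit-learn assumed; the
# utterances and labels below are toy placeholders, not the paper's data).
from sklearn.model_selection import train_test_split

utterances = [f"play item number {i % 5 + 1}" for i in range(100)]
labels = [i % 3 for i in range(100)]  # stand-in intent labels

train_utts, dev_utts, train_labels, dev_labels = train_test_split(
    utterances, labels, test_size=0.30, random_state=0
)
# The dev portion is what the paper uses to optimize the deep models and to
# tune the reranker parameters alpha and beta.
```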
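The Experiment Setup row fixes most of the context-BiLSTM(keys, values) hyperparameters: E = 300 word2vec embeddings, N = 5 on-screen values, and a hidden layer of size H = 200 between the value feature vector and the BiLSTM. The sketch below, assuming PyTorch, shows one plausible wiring of those pieces; the class name, the tanh activation, and the choice to concatenate the projected context vector onto every token embedding are assumptions, since Fig. 1b of the paper is not reproduced here.

```python
# Minimal sketch, assuming PyTorch; names and the exact fusion point are assumptions.
import random
import torch
import torch.nn as nn

E, N, H = 300, 5, 200  # word2vec embedding size, number of screen values, context hidden size

class ContextBiLSTM(nn.Module):
    """Rough sketch of a context-BiLSTM(keys, values)-style slot tagger."""

    def __init__(self, vocab_size: int, num_labels: int, lstm_hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, E)   # initialized from word2vec in the paper
        self.value_proj = nn.Sequential(           # hidden layer of size H between the
            nn.Linear(N * E, H),                   # N*E = 5*300 value vector and the BiLSTM
            nn.Tanh(),                             # (activation choice is an assumption)
        )
        # Assumption: the projected context vector is concatenated to every token embedding.
        self.bilstm = nn.LSTM(E + H, lstm_hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * lstm_hidden, num_labels)

    def forward(self, token_ids: torch.Tensor, value_vec: torch.Tensor) -> torch.Tensor:
        # token_ids: (B, T) word indices; value_vec: (B, N*E) concatenated value embeddings
        tokens = self.embed(token_ids)                         # (B, T, E)
        ctx = self.value_proj(value_vec)                       # (B, H)
        ctx = ctx.unsqueeze(1).expand(-1, tokens.size(1), -1)  # broadcast over time steps
        hidden, _ = self.bilstm(torch.cat([tokens, ctx], dim=-1))
        return self.out(hidden)                                # per-token slot logits

def shuffle_value_positions(value_embs: torch.Tensor) -> torch.Tensor:
    """Shuffle the N value slots between epochs, as the paper reports doing."""
    order = list(range(N))
    random.shuffle(order)
    # value_embs: (B, N, E) -> permute the slots, then flatten to (B, N*E)
    return value_embs[:, order, :].reshape(value_embs.size(0), N * E)
```

Training of such a model would then follow the quoted setup: stochastic gradient descent with dropout and L2 regularization, reshuffling the value slots at the start of each epoch.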
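For the non-context BiLSTM + contextual Reranker, the quoted tuning procedure is a grid search over α, β ∈ [0, 1] against dev-set slot and intent performance. A minimal sketch of that search follows; `rerank_fn` and `score_fn` are hypothetical interfaces, and the grid step is an assumption since the paper does not state one.

```python
# Minimal sketch of the alpha/beta grid search on the dev set; rerank_fn and
# score_fn are hypothetical interfaces, and the grid step is an assumption.
import itertools
import numpy as np

def grid_search_reranker(dev_hyps, dev_refs, rerank_fn, score_fn, step=0.05):
    """Return (alpha, beta, score) maximizing the dev-set slot/intent metric."""
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    best = (0.0, 0.0, float("-inf"))
    for alpha, beta in itertools.product(grid, grid):
        reranked = rerank_fn(dev_hyps, alpha, beta)
        score = score_fn(reranked, dev_refs)
        if score > best[2]:
            best = (float(alpha), float(beta), score)
    return best

# Toy usage with dummy functions: the "best" pair here is simply the one
# whose alpha + beta is closest to 1.0.
best_alpha, best_beta, best_score = grid_search_reranker(
    dev_hyps=None, dev_refs=None,
    rerank_fn=lambda hyps, a, b: (a, b),
    score_fn=lambda reranked, refs: -abs(sum(reranked) - 1.0),
)
```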