Dialog Policy Learning for Joint Clarification and Active Learning Queries

Authors: Aishwarya Padmakumar, Raymond J. Mooney (pp. 13604-13612)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We train a hierarchical dialog policy to jointly perform both clarification and active learning in the context of an interactive language-based image retrieval task motivated by an online shopping application, and demonstrate that jointly learning dialog policies for clarification and active learning is more effective than the use of static dialog policies for one or both of these functions. [...] Table 1: Results from the final batch of the test phase."
Researcher Affiliation | Collaboration | Aishwarya Padmakumar (Amazon Alexa AI), Raymond J. Mooney (University of Texas at Austin)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (e.g., blocks labeled "Pseudocode" or "Algorithm").
Open Source Code | No | The paper does not provide an explicit statement about releasing the source code for the methodology, nor does it include a link to a code repository.
Open Datasets | Yes | "To address a potential shopping application, we simulate dialogs using the iMaterialist Fashion Attribute data (Guo et al. 2019), consisting of images from the shopping site Wish annotated for a set of 228 attributes."
Dataset Splits | Yes | "We divided the data into 4 splits, policy pretrain, policy train, policy val and policy test, such that each contains images that have attributes for which positive examples are not present in earlier splits to increase the potential benefit of active learning. [...] Each of these is then split into subsets classifier training and classifier test by a uniform 60-40 split."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers (e.g., Python 3.8, PyTorch 1.9), needed to replicate the experiment.
Experiment Setup | Yes | "We train the network using a loss function that combines cross-entropy loss on p(i) over all examples with the cross-entropy loss over p (i) only for positive labels. [...] We find this more effective than a standard weighted cross-entropy loss, and the results in this paper use λ = 0.9. [...] We initialize the policy with 4 batches of dialogs, followed by 4 batches of dialogs for the training phase, and 5 batches of dialogs in the testing phase."
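The quoted setup describes a loss that mixes cross-entropy over all examples with cross-entropy restricted to positive labels, weighted by λ = 0.9. A minimal NumPy sketch of one plausible reading follows; `combined_loss` and the convex-combination form are assumptions, since the paper excerpt does not give the exact formula.

```python
import numpy as np

def binary_ce(p, y, eps=1e-12):
    """Element-wise binary cross-entropy between predictions p and labels y."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def combined_loss(p, y, lam=0.9):
    """Hypothetical combination: lam-weighted CE over all examples plus
    (1 - lam)-weighted CE over positive-label examples only."""
    ce_all = binary_ce(p, y).mean()
    pos = y == 1
    ce_pos = binary_ce(p[pos], y[pos]).mean() if pos.any() else 0.0
    return lam * ce_all + (1 - lam) * ce_pos
```

Restricting one term to positive labels is a common way to counteract label sparsity (here, 228 mostly-negative attributes) without resorting to a standard weighted cross-entropy, which the authors report was less effective.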
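The dataset-splits row states that each policy split is further divided into classifier-training and classifier-test subsets by a uniform 60-40 split. A minimal sketch of such a split is below; `split_60_40`, the seed, and the shuffle-then-cut strategy are assumptions, as the paper only specifies the 60-40 proportion.

```python
import random

def split_60_40(items, seed=0):
    """Uniformly partition items into a 60% training subset and a
    40% test subset (hypothetical helper; proportions from the paper)."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```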