End-to-end optimization of goal-driven and visually grounded dialogue systems

Authors: Florian Strub, Harm de Vries, Jérémie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We then quantitatively and qualitatively compare the performance of our system to a supervised approach on the same task. In short, our contributions are: to propose an original visually grounded goal-directed dialogue system optimized via Deep RL; to achieve 10% improvement on task completion over a supervised learning baseline. [...] We report the accuracies of the QGen trained with REINFORCE and CE in Table 2.
Researcher Affiliation | Collaboration | Florian Strub (1), Harm de Vries (2), Jérémie Mary (1), Bilal Piot (3), Aaron Courville (2), Olivier Pietquin (3). Affiliations: (1) Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL; (2) University of Montreal; (3) DeepMind
Pseudocode | Yes | Algorithm 1: Training of QGen with REINFORCE
Open Source Code | Yes | Source code available at: https://guesswhat.ai
Open Datasets | Yes | To do so, we start from a corpus of 150k human-human dialogues collected via the recently introduced GuessWhat?! game [de Vries et al., 2016]. [...] We used the GuessWhat?! dataset that includes 155,281 dialogues containing 821,955 question/answer pairs composed of 4900 words on 66,537 unique images and 609,543 objects.
Dataset Splits | No | The paper refers to a training set and, implicitly, a test set (via the 'New Images' evaluation), but it does not specify a validation set or exact split percentages.
Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory specifications) used for running the experiments are provided in the paper.
Software Dependencies | No | The paper mentions models and techniques (e.g., VGG16, LSTM, SGD) but does not provide specific software dependencies with version numbers (e.g., Python, TensorFlow, or PyTorch versions, or library versions).
Experiment Setup | Yes | We then initialize our environment with the pre-trained models and train the QGen with REINFORCE for 80 epochs with plain stochastic gradient descent (SGD) with a learning rate of 0.001 and a batch size of 64. [...] Finally, we set the maximum number of questions to 8 and the maximum number of words to 12.
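
To make the "Pseudocode" and "Experiment Setup" rows above more concrete, below is a minimal, self-contained REINFORCE sketch in PyTorch. It is not the authors' released implementation (that is at https://guesswhat.ai): ToyQGen and play_game are toy stand-ins for the real question generator, oracle, and guesser, and a simple mean-reward baseline is used for variance reduction. Only the quoted details are taken from the paper: plain SGD with learning rate 0.001, batch size 64, 80 epochs, at most 8 questions of at most 12 words, a 4900-word vocabulary, and a 0/1 task-completion reward (1 if the guesser picks the correct object).

```python
# Minimal REINFORCE sketch (not the authors' code) for fine-tuning a question
# generator (QGen) against a 0/1 task-completion reward, using the
# hyperparameters quoted in the "Experiment Setup" row above.
# ToyQGen and play_game are toy placeholders for the real QGen/oracle/guesser.

import torch
import torch.nn as nn

VOCAB_SIZE = 4900                 # vocabulary size quoted for the GuessWhat?! dataset
MAX_QUESTIONS, MAX_WORDS = 8, 12  # dialogue limits quoted in the paper
EPOCHS, BATCH_SIZE, LR = 80, 64, 1e-3


class ToyQGen(nn.Module):
    """Stand-in policy: an LSTM over word embeddings producing next-word logits."""

    def __init__(self, hidden=128):
        super().__init__()
        self.hidden = hidden
        self.embed = nn.Embedding(VOCAB_SIZE, hidden)
        self.lstm = nn.LSTMCell(hidden, hidden)
        self.out = nn.Linear(hidden, VOCAB_SIZE)

    def rollout(self):
        """Sample one dialogue (<= 8 questions x 12 words); return its summed log-prob."""
        h = torch.zeros(1, self.hidden)
        c = torch.zeros(1, self.hidden)
        word = torch.zeros(1, dtype=torch.long)  # <start> token
        log_prob = torch.zeros(1)
        for _ in range(MAX_QUESTIONS * MAX_WORDS):
            h, c = self.lstm(self.embed(word), (h, c))
            dist = torch.distributions.Categorical(logits=self.out(h))
            word = dist.sample()
            log_prob = log_prob + dist.log_prob(word)
        return log_prob.squeeze()


def play_game(qgen):
    """Toy environment; the real system queries the oracle and guesser instead."""
    log_prob = qgen.rollout()
    success = torch.rand(()) < 0.5  # placeholder for the guesser's outcome
    return log_prob, 1.0 if success else 0.0


qgen = ToyQGen()  # in the paper, QGen is first pre-trained with supervised learning
optimizer = torch.optim.SGD(qgen.parameters(), lr=LR)

for epoch in range(EPOCHS):
    log_probs, rewards = zip(*(play_game(qgen) for _ in range(BATCH_SIZE)))
    rewards = torch.tensor(rewards)
    baseline = rewards.mean()  # simple variance-reduction baseline
    # REINFORCE objective: maximize E[(r - b) * log pi(dialogue)]
    loss = -((rewards - baseline) * torch.stack(log_probs)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The mean-reward baseline is only the simplest variance-reduction choice; everything beyond the hyperparameters quoted in the table (network sizes, environment interface, baseline) is an assumption of this sketch rather than a description of the paper's implementation.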