End-to-end optimization of goal-driven and visually grounded dialogue systems
Authors: Florian Strub, Harm de Vries, Jérémie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then quantitatively and qualitatively compare the performance of our system to a supervised approach on the same task. In short, our contributions are: to propose an original visually grounded goal-directed dialogue system optimized via Deep RL; to achieve 10% improvement on task completion over a supervised learning baseline. [...] We report the accuracies of the QGen trained with REINFORCE and CE in Table 2. |
| Researcher Affiliation | Collaboration | Florian Strub¹, Harm de Vries², Jérémie Mary¹, Bilal Piot³, Aaron Courville², Olivier Pietquin³ (¹Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL; ²University of Montreal; ³DeepMind) |
| Pseudocode | Yes | Algorithm 1: Training of QGen with REINFORCE |
| Open Source Code | Yes | Source code available at: https://guesswhat.ai |
| Open Datasets | Yes | To do so, we start from a corpus of 150k human-human dialogues collected via the recently introduced Guess What?! game [de Vries et al., 2016]. [...] We used the Guess What?! dataset that includes 155,281 dialogues containing 821,955 question/answer pairs composed of 4900 words on 66,537 unique images and 609,543 objects. |
| Dataset Splits | No | The paper mentions 'training set' and 'test set' (implicitly, via 'New Images' and evaluating on the test set), but no explicit 'validation' set or specific percentages for dataset splits are provided. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory specifications) used for running the experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions models and techniques (e.g., VGG16, LSTM, SGD) but does not provide specific software dependencies with version numbers (e.g., Python, TensorFlow, PyTorch versions or library versions). |
| Experiment Setup | Yes | We then initialize our environment with the pre-trained models and train the QGen with REINFORCE for 80 epochs with plain stochastic gradient descent (SGD) with a learning rate of 0.001 and a batch size of 64. [...] Finally, we set the maximum number of questions to 8 and the maximum number of words to 12. (A hedged sketch of this training loop follows the table.) |
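
The "Pseudocode" and "Experiment Setup" rows above pin down the training procedure: REINFORCE applied to the question generator (QGen), optimized with plain SGD (learning rate 0.001, batch size 64, 80 epochs), with dialogues capped at 8 questions of at most 12 words each. The sketch below is a minimal, hypothetical rendering of that loop, not the authors' released code: `ToyQGen`, `sample_dialogue`, and `game_reward` are invented stand-ins, and the real reward (the Guesser's success given the Oracle's answers) is replaced by a random 0/1 outcome.

```python
# Minimal REINFORCE sketch assuming the hyperparameters reported in the
# Experiment Setup row. All names here are hypothetical stand-ins for the
# paper's QGen / Oracle / Guesser; see https://guesswhat.ai for the real code.
import torch
import torch.nn as nn

VOCAB_SIZE, HIDDEN = 100, 32        # toy sizes; not taken from the paper
MAX_QUESTIONS, MAX_WORDS = 8, 12    # caps reported in the paper

class ToyQGen(nn.Module):
    """Tiny LSTM word policy standing in for the paper's QGen."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
        self.cell = nn.LSTMCell(HIDDEN, HIDDEN)
        self.head = nn.Linear(HIDDEN, VOCAB_SIZE)

    def sample_dialogue(self):
        """Sample up to MAX_QUESTIONS questions of MAX_WORDS words,
        returning the log-probability of every sampled token."""
        h = torch.zeros(1, HIDDEN)
        c = torch.zeros(1, HIDDEN)
        tok = torch.zeros(1, dtype=torch.long)   # <start> token, id 0
        log_probs = []
        for _ in range(MAX_QUESTIONS * MAX_WORDS):
            h, c = self.cell(self.embed(tok), (h, c))
            dist = torch.distributions.Categorical(logits=self.head(h))
            tok = dist.sample()
            log_probs.append(dist.log_prob(tok))
        return torch.cat(log_probs)

def game_reward(_dialogue):
    """Hypothetical stand-in for the Oracle/Guesser game loop:
    reward 1 if the Guesser would find the object, else 0."""
    return torch.randint(0, 2, (1,)).float()

qgen = ToyQGen()
opt = torch.optim.SGD(qgen.parameters(), lr=0.001)  # plain SGD, lr as reported
baseline = 0.0                                      # running-mean baseline

for epoch in range(80):                 # 80 epochs, as reported; for brevity,
    opt.zero_grad()                     # each "epoch" here is a single batch
    for _ in range(64):                 # batch size 64
        log_probs = qgen.sample_dialogue()
        r = game_reward(log_probs)
        # REINFORCE: minimize -(r - baseline) * sum log pi(word), batch-averaged
        loss = (baseline - r) * log_probs.sum() / 64
        loss.backward()
        baseline = 0.99 * baseline + 0.01 * r.item()
    opt.step()
```

In the paper the reward is the Guesser's success on the sampled dialogue; the running-mean baseline above is a simplification of the variance-reduction baseline used in Algorithm 1, kept only to show where the advantage term (r - b) enters the gradient.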