Toward Diverse Text Generation with Inverse Reinforcement Learning

Authors: Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiment results demonstrate that our proposed method can generate higher quality texts than the previous methods."
Researcher Affiliation | Academia | Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang; Shanghai Key Laboratory of Intelligent Information Processing, Fudan University; School of Computer Science, Fudan University
Pseudocode | Yes | Algorithm 1: IRL for Text Generation (a hedged training-loop sketch follows the table)
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the proposed Inverse Reinforcement Learning (IRL) method; it only links to the code of the baseline models (SeqGAN and LeakGAN).
Open Datasets | Yes | "We experiment on three corpora: the synthetic oracle dataset [Yu et al., 2017], the COCO image caption dataset [Chen et al., 2015] and the IMDB movie review dataset [Diao et al., 2014]."
Dataset Splits | No | The paper explicitly states training and test splits (e.g., "80,000 texts as training set, and another 5,000 as test set" for COCO) but does not specify a separate validation split.
Hardware Specification | No | The paper does not describe the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions the Adam optimizer but does not give version numbers for any software dependencies or libraries used in the implementation.
Experiment Setup | Yes | Table 1 gives the experimental settings on the three corpora. Text generator: embedding dimension 32/64/128, hidden layer dimension 32/64/128, batch size 64/128, optimizer and learning rate Adam, 0.005. Reward approximator: dropout 0.75/0.45/0.75, batch size 64/1024, optimizer and learning rate Adam, 0.0004. (These values are collected in the configuration sketch below.)
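
For convenience, the Table 1 settings quoted in the Experiment Setup row can be collected into a single configuration object. The sketch below is only an illustrative way of recording them in Python: the field names are hypothetical, the numeric values come from the report, three-valued entries are assumed to follow the corpus order of the dataset list (synthetic oracle, COCO, IMDB), and the report does not say which corpora the two-valued batch sizes map to.

```python
# Hypothetical consolidation of the Table 1 settings quoted above.
# Field names are illustrative; only the numbers come from the report.
EXPERIMENT_SETUP = {
    "text_generator": {
        "embedding_dim": [32, 64, 128],   # per corpus (assumed order: synthetic, COCO, IMDB)
        "hidden_dim": [32, 64, 128],      # per corpus (assumed order)
        "batch_size": [64, 128],          # corpus mapping not stated in the report
        "optimizer": "Adam",
        "learning_rate": 0.005,
    },
    "reward_approximator": {
        "dropout": [0.75, 0.45, 0.75],    # per corpus (assumed order)
        "batch_size": [64, 1024],         # corpus mapping not stated in the report
        "optimizer": "Adam",
        "learning_rate": 0.0004,
    },
}
```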
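
The Pseudocode row refers to the paper's Algorithm 1 (IRL for Text Generation), which this page does not reproduce. As orientation only, a maximum-entropy-style IRL scheme for text generation typically alternates between fitting a reward approximator that contrasts real and generated texts and updating the generator with an entropy-regularized policy gradient under that reward. The sketch below is a minimal, hypothetical illustration of such an alternating loop in PyTorch; the model architectures, objectives, and function names are assumptions, not the paper's exact Algorithm 1 (only the two learning rates are taken from Table 1).

```python
import torch
import torch.nn as nn

# Schematic stand-ins for the two components named in the paper: a text
# generator (policy) and a reward approximator. NOT the paper's Algorithm 1.
VOCAB, EMB, HID, MAX_LEN = 1000, 32, 32, 20

class Generator(nn.Module):
    """Autoregressive LSTM policy that samples token sequences."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.LSTM(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def sample(self, batch, length=MAX_LEN):
        tokens = torch.zeros(batch, 1, dtype=torch.long)  # <bos> token assumed to be 0
        log_probs, state = [], None
        for _ in range(length):
            h, state = self.rnn(self.emb(tokens[:, -1:]), state)
            dist = torch.distributions.Categorical(logits=self.out(h[:, -1]))
            nxt = dist.sample()
            log_probs.append(dist.log_prob(nxt))
            tokens = torch.cat([tokens, nxt.unsqueeze(1)], dim=1)
        return tokens[:, 1:], torch.stack(log_probs, dim=1)

class RewardApproximator(nn.Module):
    """Scores a whole text; mean-pooled embeddings + MLP, chosen for brevity."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.mlp = nn.Sequential(nn.Linear(EMB, HID), nn.ReLU(), nn.Linear(HID, 1))

    def forward(self, tokens):
        return self.mlp(self.emb(tokens).mean(dim=1)).squeeze(-1)

gen, rew = Generator(), RewardApproximator()
opt_g = torch.optim.Adam(gen.parameters(), lr=0.005)   # generator lr from Table 1
opt_r = torch.optim.Adam(rew.parameters(), lr=0.0004)  # approximator lr from Table 1

def train_step(real_tokens):
    # 1) Reward approximator update: push real texts up and generated texts down
    #    (a max-entropy-IRL-style contrast; the paper's exact objective may differ).
    fake_tokens, _ = gen.sample(real_tokens.size(0))
    loss_r = -(rew(real_tokens).mean() - torch.logsumexp(rew(fake_tokens), dim=0))
    opt_r.zero_grad(); loss_r.backward(); opt_r.step()

    # 2) Generator update: REINFORCE-style policy gradient where the return is the
    #    learned reward minus the sample's log-likelihood (an entropy-like bonus).
    fake_tokens, log_probs = gen.sample(real_tokens.size(0))
    with torch.no_grad():
        advantage = rew(fake_tokens) - log_probs.sum(dim=1)
    loss_g = -(advantage * log_probs.sum(dim=1)).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_r.item(), loss_g.item()

if __name__ == "__main__":
    real = torch.randint(0, VOCAB, (8, MAX_LEN))  # placeholder batch of "real" texts
    print(train_step(real))
```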