Adaptively Multi-Objective Adversarial Training for Dialogue Generation

Authors: Xuemiao Zhang, Zhouxing Tan, Xiaoning Zhang, Yang Cao, Rui Yan

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on two real-world datasets show a significant improvement over the baselines.
Researcher Affiliation | Collaboration | Xuemiao Zhang (1), Zhouxing Tan (1), Xiaoning Zhang (2), Yang Cao (4) and Rui Yan (3). 1: School of Software & Microelectronics, Peking University; 2: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences; 3: Wangxuan Institute of Computer Technology, Peking University; 4: SenseTime Research.
Pseudocode | Yes | Algorithm 1: Training AMPGAN.
Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of its source code.
Open Datasets | Yes | Cornell Movie Dataset (denoted as S1) contains a large, metadata-rich collection of fictional conversations extracted from raw movie scripts. It consists of 220,579 exchanges between 10,292 pairs of movie characters, involving 9,035 characters across 617 movies, for a total of 304,713 utterances. OpenSubtitles Dataset (denoted as S2) is a well-known human-human scripted dialogue dataset, extracted from movie subtitles that are not speaker-aligned [Tiedemann, 2009].
Dataset Splits | No | The paper mentions using a validation set for early stopping ('if the performance on the validation set has not improved for a long time, stop training and choose the checkpoint with the best performance'), but does not provide specific split percentages or sample counts for the training, validation, and test sets. (A minimal sketch of this early-stopping rule is given below the table.)
Hardware Specification | No | The paper does not provide any specific hardware details, such as the CPU or GPU models used for running the experiments.
Software Dependencies | No | The paper mentions specific tools such as the Adam optimizer and the Stanford CoreNLP parser, but does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | We set the training batch size to 128, the size of the word embeddings and the graph node embeddings to 512, the hidden size of the hidden layers of all encoders in all models to 256, the number of LSTM layers to 2, and the number of GCN layers in Dsyn to 1. We train all models using the Adam optimizer [Kingma and Ba, 2015] and set all dropout rates to 0.7. For the generator Gθ, we use a learning-rate decay strategy with an initial learning rate of 0.1 and a decay factor of 0.99. We fix the learning rate at 0.001 for Drf and Dsyn. In pre-training, we first pre-train Gθ for 5000 iterations, then use Gθ to produce 2500 × 128 negative examples and sample the same number of positive examples from the dataset; the two sets are combined to pre-train Drf and Dsyn. In MC search, we follow [Li et al., 2017]: given a partially decoded prefix s_P, Gθ keeps sampling tokens from the word distribution until decoding is complete. This process is repeated k times (k is set to 7) to obtain k sequences sharing the common prefix s_P, and the average of the corresponding k discriminator scores is used as the reward. (A hedged sketch of these settings and the MC search reward is given below the table.)
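The Experiment Setup row lists the reported hyperparameters and describes the Monte Carlo (MC) search used to compute rewards for partially decoded responses. The Python sketch below collects the reported values into a configuration dictionary and illustrates the rollout-averaging reward; generator.rollout and discriminator.score are assumed interfaces for illustration only, since the authors' code is not released.

```python
# Hyperparameters as reported in the Experiment Setup row.
CONFIG = {
    "batch_size": 128,
    "embedding_size": 512,        # word and graph-node embeddings
    "hidden_size": 256,           # hidden layers of all encoders
    "lstm_layers": 2,
    "gcn_layers_dsyn": 1,         # GCN layers in Dsyn
    "dropout": 0.7,
    "g_initial_lr": 0.1,          # generator learning rate, decayed by 0.99
    "g_lr_decay": 0.99,
    "d_lr": 0.001,                # fixed learning rate for Drf and Dsyn
    "g_pretrain_iterations": 5000,
    "d_pretrain_negatives": 2500 * 128,  # matched with the same number of positives
    "mc_rollouts": 7,             # k in MC search
}

def mc_search_reward(generator, discriminator, context, prefix, k=CONFIG["mc_rollouts"]):
    """Estimate the reward for a partially decoded response s_P via MC rollouts."""
    scores = []
    for _ in range(k):
        # Keep sampling tokens after the shared prefix until decoding is complete
        # (generator.rollout is an assumed interface).
        response = generator.rollout(context, prefix)
        # Discriminator score for the completed (context, response) pair
        # (discriminator.score is an assumed interface).
        scores.append(discriminator.score(context, response))
    # The reward is the average of the k discriminator scores.
    return sum(scores) / len(scores)
```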
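The Dataset Splits row quotes the paper's early-stopping criterion without giving split sizes. Below is a minimal sketch of that rule, assuming a higher-is-better validation metric; the function names (train_step, evaluate) and the patience value are illustrative placeholders, not details from the paper.

```python
import copy

def train_with_early_stopping(model, train_step, evaluate, patience=10):
    """Train until the validation score stops improving, then keep the best checkpoint."""
    best_score = float("-inf")
    best_model = copy.deepcopy(model)
    stale = 0
    while stale < patience:
        train_step(model)        # one training epoch (or a fixed number of iterations)
        score = evaluate(model)  # performance on the validation set, higher is better
        if score > best_score:
            best_score = score
            best_model = copy.deepcopy(model)  # checkpoint with the best performance so far
            stale = 0
        else:
            stale += 1
    return best_model, best_score
```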