MixPoet: Diverse Poetry Generation via Learning Controllable Mixed Latent Space

Authors: Xiaoyuan Yi, Ruoyu Li, Cheng Yang, Wenhao Li, Maosong Sun

AAAI 2020, pp. 9450-9457 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiment results on Chinese poetry demonstrate that MixPoet improves both diversity and quality against three state-of-the-art models.
Researcher Affiliation | Collaboration | 1) Department of Computer Science and Technology, Tsinghua University; Institute for Artificial Intelligence, Tsinghua University; State Key Lab on Intelligent Technology and Systems, Tsinghua University; 2) Beijing University of Posts and Telecommunications; 3) 6ESTATES PTE LTD, Singapore
Pseudocode | Yes | Algorithm 1: Training Process of MixPoet-AUS
Open Source Code | No | MixPoet will be incorporated into Jiuge, the THUNLP online poetry generation system (https://jiuge.thunlp.cn). This indicates the model will be deployed in a system, but not that the source code for the methodology described in the paper is openly provided or linked for general access.
Open Datasets | No | We mainly experiment on two typical factors: living experience and historical background... Then we build a labelled corpus called Chinese Quatrain Corpus with Factors (CQCF), which contains 49,451 poems... Besides, we also collect a Chinese Quatrain Corpus (CQC) as unlabelled data, which comprises 117,392 poems. While the paper describes its datasets, it does not provide concrete access information (a link, DOI, repository name, or formal citation to an established public dataset with access details).
Dataset Splits | Yes | For CQC, we randomly select 4,500 poems each for validation and testing, and use the rest for training. For CQCF, we use 5% for validation and 5% for testing. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments; it only describes software settings.
Software Dependencies | No | The paper mentions Adam (Kingma and Ba 2015) as the optimization method and leaky ReLU and tanh as activation functions, but it does not specify any software dependencies with version numbers (e.g., PyTorch, TensorFlow, or scikit-learn).
Experiment Setup | Yes | We set the sizes of the hidden state, context vector, latent variable, word embedding, and factor embedding to 512, 512, 256, 256, and 64, respectively. The activation function is leaky ReLU for the discriminator and prior networks and tanh for the others; d = 3 in Eq. (4) and α = β = 1 in Eq. (8). Adam (Kingma and Ba 2015) with mini-batches (batch size = 128) is used for optimization. To avoid overfitting, we also adopt dropout and l2-norm regularization. For MixPoet-AUS, we update the discriminator five times per update of the other parts. (A training-loop sketch follows the table.)
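
The splits described in the Dataset Splits row are simple random holdouts. The following is a minimal Python sketch of both splits, assuming the corpora are already loaded as lists of poems; the loading step and the random seed are assumptions, not details from the paper.

```python
# A minimal sketch of the reported splits, assuming each corpus is a
# Python list of poems; the seed is an assumption, not from the paper.
import random

def split_cqc(poems, n_valid=4500, n_test=4500, seed=0):
    """CQC: randomly hold out 4,500 poems each for validation and testing."""
    poems = list(poems)
    random.Random(seed).shuffle(poems)
    valid = poems[:n_valid]
    test = poems[n_valid:n_valid + n_test]
    train = poems[n_valid + n_test:]
    return train, valid, test

def split_cqcf(poems, frac=0.05, seed=0):
    """CQCF: 5% for validation, 5% for testing, and the rest for training."""
    poems = list(poems)
    random.Random(seed).shuffle(poems)
    n = int(len(poems) * frac)
    return poems[2 * n:], poems[:n], poems[n:2 * n]
```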
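The Experiment Setup row also fixes the adversarial update schedule for MixPoet-AUS. Below is a hedged PyTorch sketch of that schedule (five discriminator updates per update of the other parts, Adam, batch size 128, l2 regularization via weight decay); the placeholder networks, losses, and weight-decay strength are assumptions, not the authors' implementation, and the reconstruction terms weighted by α and β are omitted.

```python
# A hedged sketch of the reported training schedule, assuming PyTorch.
# The networks and losses below are minimal stand-ins, not the authors'
# MixPoet-AUS model; the weight-decay strength is assumed.
import torch
from torch import nn

HIDDEN, LATENT, FACTOR_EMB = 512, 256, 64  # reported sizes
BATCH_SIZE = 128
D_STEPS = 5  # discriminator updated five times per update of the other parts

# Leaky ReLU for the prior and discriminator networks, as reported.
prior_net = nn.Sequential(nn.Linear(FACTOR_EMB, LATENT), nn.LeakyReLU())
discriminator = nn.Sequential(nn.Linear(LATENT, HIDDEN), nn.LeakyReLU(),
                              nn.Linear(HIDDEN, 1))

# Adam with l2 regularization expressed as weight_decay (strength assumed).
opt_d = torch.optim.Adam(discriminator.parameters(), weight_decay=1e-5)
opt_g = torch.optim.Adam(prior_net.parameters(), weight_decay=1e-5)
bce = nn.BCEWithLogitsLoss()

for step in range(100):  # random tensors stand in for real poem batches
    factors = torch.randn(BATCH_SIZE, FACTOR_EMB)  # factor embeddings
    real_z = torch.randn(BATCH_SIZE, LATENT)       # stand-in posterior samples
    for _ in range(D_STEPS):  # train the discriminator more often
        opt_d.zero_grad()
        fake_z = prior_net(factors).detach()
        d_loss = (bce(discriminator(real_z), torch.ones(BATCH_SIZE, 1))
                  + bce(discriminator(fake_z), torch.zeros(BATCH_SIZE, 1)))
        d_loss.backward()
        opt_d.step()
    opt_g.zero_grad()  # one update of the other parts
    g_loss = bce(discriminator(prior_net(factors)), torch.ones(BATCH_SIZE, 1))
    g_loss.backward()
    opt_g.step()
```

Updating the discriminator several times per update of the rest of the model is a standard stabilization choice in adversarial latent-space training, and matches the 5:1 schedule the paper reports.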