MixPoet: Diverse Poetry Generation via Learning Controllable Mixed Latent Space
Authors: Xiaoyuan Yi, Ruoyu Li, Cheng Yang, Wenhao Li, Maosong Sun
AAAI 2020, pp. 9450–9457
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results on Chinese poetry demonstrate that MixPoet improves both diversity and quality against three state-of-the-art models. |
| Researcher Affiliation | Collaboration | ¹Department of Computer Science and Technology, Institute for Artificial Intelligence, State Key Lab on Intelligent Technology and Systems, Tsinghua University; ²Beijing University of Posts and Telecommunications; ³6ESTATES PTE LTD, Singapore |
| Pseudocode | Yes | Algorithm 1: Training Process of MixPoet-AUS |
| Open Source Code | No | MixPoet will be incorporated into Jiuge, the THUNLP online poetry generation system (https://jiuge.thunlp.cn). This indicates the model will be deployed in a system, but not that the source code for the methodology described in the paper is openly provided or linked for general access. |
| Open Datasets | No | We mainly experiment on two typical factors: living experience and historical background... Then we build a labelled corpus called Chinese Quatrain Corpus with Factors (CQCF), which contains 49,451 poems... Besides, we also collect a Chinese Quatrain Corpus (CQC) as unlabelled data which comprises 117,392 poems. While the paper describes its datasets, it does not provide concrete access information (a link, DOI, repository name, or formal citation to an established public dataset) for them. |
| Dataset Splits | Yes | For CQC, we randomly select 4,500 poems for validation and testing, respectively, and the rest for training. For CQCF, we use 5% for validation and 5% for testing. A sketch of this split procedure appears after the table. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. It only describes software settings. |
| Software Dependencies | No | The paper mentions 'Adam (Kingma and Ba 2015)' as the optimization method and 'leaky ReLU' and 'tanh' as activation functions, but it does not specify any software dependencies with version numbers (e.g., PyTorch, TensorFlow, scikit-learn, with their versions). |
| Experiment Setup | Yes | We set the sizes of hidden state, context vector, latent variable, word embedding and factor embedding to 512, 512, 256, 256 and 64 respectively. The activation function is leaky ReLU for the discriminator and prior networks and is tanh for others. d = 3 in Eq.(4); α = β = 1 in Eq.(8). Adam (Kingma and Ba 2015) with mini-batches (batch size=128) is used for optimization. To avoid overfitting, we also adopt dropout and l2-norm regularization. For MixPoet-AUS, we update the discriminator five times per update of other parts. A configuration sketch of this setup appears after the table. |
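
Below is a minimal sketch of the reported split procedure. The function names, the fixed seed, and the shuffle-then-slice approach are assumptions; the paper only states the held-out counts (4,500 + 4,500 for CQC) and proportions (5% + 5% for CQCF), and the CQC/CQCF corpora themselves are not publicly released.

```python
# Hypothetical reconstruction of the paper's data splits; only the
# counts and proportions come from the paper.
import random

def split_cqc(poems, n_valid=4500, n_test=4500, seed=0):
    """CQC: 4,500 poems each for validation and testing, the rest for training."""
    poems = list(poems)
    random.Random(seed).shuffle(poems)
    valid = poems[:n_valid]
    test = poems[n_valid:n_valid + n_test]
    train = poems[n_valid + n_test:]
    return train, valid, test

def split_cqcf(poems, frac=0.05, seed=0):
    """CQCF: 5% for validation, 5% for testing, the remaining 90% for training."""
    poems = list(poems)
    random.Random(seed).shuffle(poems)
    n = int(len(poems) * frac)
    return poems[2 * n:], poems[:n], poems[n:2 * n]
```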
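
And a hedged sketch of the reported hyperparameters and the MixPoet-AUS update schedule. The class names, the `loss` methods, and the PyTorch-style optimizer interface are placeholders, not the authors' implementation; only the numeric values and the five-to-one update ratio come from the paper.

```python
# Hypothetical configuration and training schedule mirroring the
# reported setup; the MixPoet modules are not public, so the model
# objects here are stand-ins.
from dataclasses import dataclass

@dataclass
class MixPoetConfig:
    hidden_size: int = 512      # hidden state
    context_size: int = 512     # context vector
    latent_size: int = 256      # latent variable
    word_emb_size: int = 256    # word embedding
    factor_emb_size: int = 64   # factor embedding
    d: int = 3                  # d in Eq.(4)
    alpha: float = 1.0          # α in Eq.(8)
    beta: float = 1.0           # β in Eq.(8)
    batch_size: int = 128
    n_critic: int = 5           # discriminator updates per update of other parts

def train_epoch(batches, discriminator, other_parts, opt_d, opt_g, n_critic=5):
    """Adversarial schedule: five discriminator steps per step of the rest.

    `discriminator.loss` and `other_parts.loss` are hypothetical stand-ins
    for the paper's adversarial and remaining objectives.
    """
    it = iter(batches)
    while True:
        try:
            for _ in range(n_critic):
                batch = next(it)
                opt_d.zero_grad()
                discriminator.loss(batch).backward()
                opt_d.step()
            batch = next(it)
            opt_g.zero_grad()
            other_parts.loss(batch).backward()
            opt_g.step()
        except StopIteration:
            break
```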