Chinese Song Iambics Generation with Neural Attention-Based Model

Authors: Qixin Wang, Tianyi Luo, Dong Wang, Chao Xing

IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Both the automatic and subjective evaluation results show that our model indeed can learn the complex structural and rhythmic patterns of Song iambics, and the generation is rather successful."
Researcher Affiliation | Collaboration | 1 CSLT, RIIT, Tsinghua University, China; 2 Tsinghua National Lab for Information Science and Technology, Beijing, China; 3 Huilan Limited, Beijing, China; 4 CIST, Beijing University of Posts and Telecommunications, China
Pseudocode | No | The paper does not contain pseudocode or a clearly labeled algorithm block.
Open Source Code | No | The paper mentions a 'word2vec tool' with a URL (https://code.google.com/archive/p/word2vec/), but this refers to a third-party tool used for initialization, not open-source code for the methodology described in the paper.
Open Datasets | No | The paper mentions using a 'Song iambics corpus (Songci)' collected from the Internet and the 'Gigaword corpus', but it does not provide concrete access information (link, DOI, repository, or formal citation for public access) for these datasets.
Dataset Splits | No | The paper specifies training and test sets ('15,001 are used for training and 688 are used for test') but does not mention a separate validation split.
Hardware Specification | No | The paper provides no specific hardware details (e.g., GPU/CPU models, memory) for running the experiments.
Software Dependencies | No | The paper mentions the 'Moses tool [Koehn et al., 2007]' and the 'AdaDelta algorithm [Zeiler, 2012]', but it does not provide version numbers for the software components required to reproduce the experiments.
Experiment Setup | Yes | For the attention model, both the encoder and decoder involve a recurrent hidden layer with 500 hidden units and a non-recurrent hidden layer with 600 units. A max-out non-linear layer then reduces the dimensionality to 300, followed by a linear transform that generates the output units corresponding to the possible Chinese characters. The model is trained with the AdaDelta algorithm [Zeiler, 2012] with a minibatch of 60 sentences.
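
To make the reported configuration concrete, here is a minimal sketch of an attention-based encoder-decoder using the hyperparameters quoted above (500 recurrent units, a 600-unit non-recurrent layer, max-out down to 300, AdaDelta, minibatch of 60 sentences). The paper releases no code, so the framework (PyTorch), the GRU cell choice, the vocabulary size, the max-out pool size, and all names below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of the reported setup; PyTorch, GRU cells, vocabulary size,
# and max-out pool size are assumptions -- the paper releases no code.
import torch
import torch.nn as nn

VOCAB_SIZE = 6000   # assumed number of distinct Chinese characters
EMB_DIM = 300       # assumed embedding size (the paper initializes embeddings with word2vec)

class AttentionIambicsSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        # Encoder and decoder recurrent layers: 500 hidden units each (as reported).
        self.encoder = nn.GRU(EMB_DIM, 500, batch_first=True)
        self.decoder_cell = nn.GRUCell(EMB_DIM + 500, 500)
        # Additive-style attention: score each encoder state against the decoder state.
        self.attn = nn.Linear(500 + 500, 1)
        # Non-recurrent hidden layer with 600 units (as reported),
        # then max-out to 300 dimensions and a linear map to the character vocabulary.
        self.hidden = nn.Linear(500 + 500, 600)
        self.maxout = nn.Linear(600, 2 * 300)   # max-out with 2 pieces (pool size assumed)
        self.output = nn.Linear(300, VOCAB_SIZE)

    def forward(self, src, tgt):
        enc_states, enc_last = self.encoder(self.embed(src))   # (B, Ts, 500)
        dec_h = enc_last.squeeze(0)                             # initial decoder state
        logits = []
        for t in range(tgt.size(1)):
            # Attention weights over encoder states for this decoding step.
            scores = self.attn(torch.cat(
                [enc_states, dec_h.unsqueeze(1).expand_as(enc_states)], dim=-1)).squeeze(-1)
            context = (torch.softmax(scores, dim=1).unsqueeze(-1) * enc_states).sum(dim=1)
            dec_h = self.decoder_cell(
                torch.cat([self.embed(tgt[:, t]), context], dim=-1), dec_h)
            h = torch.tanh(self.hidden(torch.cat([dec_h, context], dim=-1)))
            m = self.maxout(h).view(-1, 300, 2).max(dim=-1).values  # max-out to 300 dims
            logits.append(self.output(m))
        return torch.stack(logits, dim=1)                        # (B, Tt, VOCAB_SIZE)

model = AttentionIambicsSketch()
# AdaDelta optimizer and minibatches of 60 sentences, as stated in the paper.
optimizer = torch.optim.Adadelta(model.parameters())
BATCH_SIZE = 60
```

This sketch only mirrors the layer sizes and optimizer named in the paper; sequence padding, word2vec-based embedding initialization, and decoding constraints for the iambic format would still need to be added to approximate the original system.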