Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation

Authors: Ganbin Zhou, Ping Luo, Rongyu Cao, Yijun Xiao, Fen Lin, Bo Chen, Qing He

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that the proposed X2TREE framework outperforms baseline methods with an 11.15% increase in acceptance ratio.
Researcher Affiliation | Collaboration | Ganbin Zhou (1,2), Ping Luo (1,2), Rongyu Cao (1,2), Yijun Xiao (3), Fen Lin (4), Bo Chen (4), Qing He (1,2). 1: Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China ({zhouganbin, luop, heqing}@ict.ac.cn). 2: University of Chinese Academy of Sciences, Beijing 100049, China. 3: Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA 93106, USA. 4: WeChat Search Application Department, Tencent, China.
Pseudocode | Yes | Algorithm 1 (CANONICALIZE) and Algorithm 2 (GENERALIZEDBEAMSEARCH) are provided.
Open Source Code | No | The paper does not explicitly state that its source code is released, nor does it link to a code repository for the described methodology.
Open Datasets | No | The paper states that "14 million post-response pairs were obtained from Tencent Weibo" and provides the URL http://t.qq.com/?lang=en_US. However, this points to the social media platform in general, not to the dataset itself, and it is not a formal citation with authors and year. It therefore does not provide concrete access to a publicly available dataset.
Dataset Splits | Yes | After removing spam and advertisements, 815,852 pairs were left, of which 775,852 are used for training and 40,000 for model validation.
Hardware Specification | Yes | Our implementations are based on the Theano library (Bastien et al. 2012) and run on an NVIDIA K80 GPU.
Software Dependencies | No | The paper mentions the Theano library (Bastien et al. 2012) but does not specify a version number for it or for other software components.
Experiment Setup | Yes | We applied a one-layer GRU (Cho et al. 2014) with 1,024-dimensional hidden states to {f_k}_{k=1}^K and to all baseline models. As suggested in (Shang, Lu, and Li 2015), the word embeddings for the encoders and decoders are learned separately, with their dimension set to 128 for all models. All parameters were initialized from a uniform distribution between -0.01 and 0.01. In training, the mini-batch size is 128 and ADADELTA (Zeiler 2012) is used for optimization. Training stops if the perplexity on the validation set increases for 4 consecutive epochs. When generating responses, X2TREE uses generalized beam search with global beam size G = 6 and local beam size L = 6, while the X2SEQ baseline models use conventional beam search with beam size 200.
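
To make the stopping criterion in the Experiment Setup row concrete, the following is a minimal sketch in plain Python, not the authors' Theano implementation. The callbacks train_one_epoch and validation_perplexity are hypothetical placeholders for one training pass (mini-batch size 128, ADADELTA) and a perplexity evaluation on the 40,000 validation pairs; the max_epochs cap is an illustrative safeguard and is not specified in the paper.

    # Minimal sketch (assumption: plain Python stand-in, not the authors' Theano code) of the
    # early-stopping rule: halt once validation perplexity has risen for 4 consecutive epochs.
    def train_with_early_stopping(train_one_epoch, validation_perplexity,
                                  patience=4, max_epochs=100):
        """Run epochs until validation perplexity increases `patience` epochs in a row."""
        prev_ppl = float("inf")
        best_ppl = float("inf")
        consecutive_increases = 0
        for epoch in range(max_epochs):
            train_one_epoch()              # hypothetical: one pass over the 775,852 training pairs
            ppl = validation_perplexity()  # hypothetical: perplexity on the 40,000 validation pairs
            consecutive_increases = consecutive_increases + 1 if ppl > prev_ppl else 0
            best_ppl = min(best_ppl, ppl)
            prev_ppl = ppl
            if consecutive_increases >= patience:
                break                      # 4 consecutive increases: stop training
        return best_ppl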