Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation
Authors: Ganbin Zhou, Ping Luo, Rongyu Cao, Yijun Xiao, Fen Lin, Bo Chen, Qing He
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that the proposed X2TREE framework outperforms baseline methods with an 11.15% increase in acceptance ratio. |
| Researcher Affiliation | Collaboration | Ganbin Zhou (1,2), Ping Luo (1,2), Rongyu Cao (1,2), Yijun Xiao (3), Fen Lin (4), Bo Chen (4), Qing He (1,2). 1: Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China ({zhouganbin, luop, heqing}@ict.ac.cn). 2: University of Chinese Academy of Sciences, Beijing 100049, China. 3: Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA 93106, USA. 4: WeChat Search Application Department, Tencent, China. |
| Pseudocode | Yes | Algorithm 1 CANONICALIZE and Algorithm 2 GENERALIZEDBEAMSEARCH are provided. |
| Open Source Code | No | The paper does not provide an explicit statement about the release of its source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper states that "14 million post-response pairs were obtained from Tencent Weibo" and provides the URL http://t.qq.com/?lang=en_US. However, this points to the social media platform in general, not to the dataset itself, and is not a formal citation with authors and year, so it does not provide concrete access to a publicly available dataset. |
| Dataset Splits | Yes | After removing spam and advertisements, 815,852 pairs were left, among which 775,852 are for training and 40,000 for model validation. |
| Hardware Specification | Yes | Our implementations are based on the Theano library (Bastien et al. 2012) over NVIDIA K80 GPU. |
| Software Dependencies | No | The paper mentions the "Theano library (Bastien et al. 2012)" but does not give version numbers for Theano or for any other software dependencies. |
| Experiment Setup | Yes | We applied one-layer GRU (Cho et al. 2014) with 1,024-dimensional hidden states to {f_k}_{k=1}^{K} and all baseline models. As suggested in (Shang, Lu, and Li 2015), the word embeddings for the encoders and decoders are learned separately, with dimensions set to 128 for all models. All the parameters were initialized using a uniform distribution between -0.01 and 0.01. In training, the mini-batch size is 128. We used ADADELTA (Zeiler 2012) for optimization. The training stops if the perplexity on the validation set increases for 4 consecutive epochs. When generating responses, for X2TREE we use generalized beam search with global beam size G = 6, local beam size L = 6. For other X2SEQ baseline models, conventional beam search with beam size 200 is used. (Hedged sketches of this configuration and of the beam-size interpretation appear below the table.) |
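
The Experiment Setup row fixes every hyperparameter the paper reports. Since no source code is released (see the Open Source Code row), the sketch below is not the authors' implementation; it only gathers the reported values into one illustrative Python configuration, with hypothetical names (`X2TreeConfig`, `should_stop`).

```python
# Hypothetical consolidation of the reported hyperparameters; the paper's own
# Theano code is not released, so all names here are illustrative only.
from dataclasses import dataclass


@dataclass
class X2TreeConfig:
    # Encoder/decoder recurrent units: one-layer GRU (Cho et al. 2014)
    gru_hidden_dim: int = 1024
    # Word embeddings learned separately for encoders and decoders (Shang, Lu, and Li 2015)
    embedding_dim: int = 128
    # Uniform parameter initialization range
    init_low: float = -0.01
    init_high: float = 0.01
    # Training
    batch_size: int = 128
    optimizer: str = "adadelta"          # ADADELTA (Zeiler 2012)
    early_stop_patience_epochs: int = 4  # stop when validation perplexity rises 4 epochs in a row
    # Decoding
    global_beam_size: int = 6            # G, generalized beam search (X2TREE)
    local_beam_size: int = 6             # L, generalized beam search (X2TREE)
    baseline_beam_size: int = 200        # conventional beam search for X2SEQ baselines


def should_stop(val_perplexities: list[float], patience: int = 4) -> bool:
    """Early stopping as described: halt if validation perplexity has
    increased for `patience` consecutive epochs."""
    if len(val_perplexities) <= patience:
        return False
    recent = val_perplexities[-(patience + 1):]
    return all(recent[i] < recent[i + 1] for i in range(patience))
```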
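
The paper's generalized beam search (Algorithm 2 GENERALIZEDBEAMSEARCH) is defined over partial response trees and is not reproduced here. The sketch below only illustrates one plausible reading of the global/local beam sizes G and L quoted above: a frontier capped at G candidates, with at most L continuations kept per expanded candidate. The `expand`/`score`/`is_complete` interface and all names are assumptions, not the paper's API.

```python
import heapq
from typing import Callable, Iterable, TypeVar

State = TypeVar("State")


def generalized_beam_search(
    initial: State,
    expand: Callable[[State], Iterable[tuple[State, float]]],  # yields (next_state, log_prob)
    is_complete: Callable[[State], bool],
    score: Callable[[State], float],
    global_beam: int = 6,   # G: size of the frontier kept across steps
    local_beam: int = 6,    # L: continuations kept per expanded candidate
    max_steps: int = 50,
) -> State:
    """Generic beam search with separate global and local beam widths.

    An assumed interpretation of the G/L parameters, not the paper's
    Algorithm 2, which operates over partial response trees.
    """
    frontier = [(score(initial), initial)]
    completed: list[tuple[float, State]] = []
    for _ in range(max_steps):
        candidates: list[tuple[float, State]] = []
        for s, state in frontier:
            if is_complete(state):
                completed.append((s, state))
                continue
            # Keep at most L local continuations of this candidate.
            local = heapq.nlargest(local_beam, expand(state), key=lambda x: x[1])
            candidates.extend((score(nxt), nxt) for nxt, _ in local)
        if not candidates:
            break
        # Keep at most G candidates overall as the new frontier.
        frontier = heapq.nlargest(global_beam, candidates, key=lambda x: x[0])
    completed.extend(frontier)
    return max(completed, key=lambda x: x[0])[1] if completed else initial
```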