Deep Short Text Classification with Knowledge Powered Attention

Authors: Jindong Chen, Yizhou Hu, Jingping Liu, Yanghua Xiao, Haiyun Jiang

AAAI 2019, pp. 6252-6259 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge powered attention.
Researcher Affiliation | Collaboration | (1) Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, China; (2) CETC Big Data Research Institute Co., Ltd., Guizhou, China; (3) Shanghai Institute of Intelligent Electronics & Systems, Shanghai, China; (4) Shuyan Technology, Shanghai, China; (5) Alibaba Group, Zhejiang, China
Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not include an unambiguous statement or a direct link to the source code for the methodology it describes.
Open Datasets | Yes | We conduct experiments on four datasets, as shown in Table 1. The first one is a Chinese Weibo emotion analysis dataset (footnote 2) from NLPCC2013 (Zhou et al. 2017). ... The second one is a product review dataset (footnote 3) from NLPCC2014 (Zhou, Xu, and Gui 2017). ... The third one is the Chinese news title dataset (footnote 4) with 18 classes ... from NLPCC2017 (Qiu, Gong, and Huang 2017). ... The Topic dataset is collected from Sogou news (Fu et al. 2015). Footnotes 2, 3, 4, and 6 provide URLs to these datasets: (2) http://tcci.ccf.org.cn/conference/2013/pages/page04_sam.html (3) http://tcci.ccf.org.cn/conference/2014/pages/page04_sam.html (4) http://tcci.ccf.org.cn/conference/2017/taskdata.php (6) http://www.sogou.com/labs/resource/list_news.php
Dataset Splits | Yes | Table 1 reports the class counts and Training/Validation/Test splits (the Avg. Chars, Avg. Words, Avg. Ent, and Avg. Con columns are elided here): Weibo (7 classes): 3771/665/500; Product Review (2 classes): 7648/1350/1000; News Title (18 classes): 154999/27300/10000; Topic (20 classes): 6170/1090/700. These figures are collected into a Python mapping in the first sketch after the table.
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments with specific model numbers or types.
Software Dependencies | No | The paper mentions 'Adam (Kingma and Ba 2014)' and the jieba tool (footnote 5, which gives a GitHub link), but it does not provide version numbers for these or any other key software components, which reproducibility requires. (A hedged environment sketch appears as the second sketch after the table.)
Experiment Setup | Yes | For all models, we use Adam (Kingma and Ba 2014) for learning, with a learning rate of 0.01. The batch size is set to 64. The training epochs are set to 20. We use 50-dimension skip-gram character and word embedding... We use 1D CNN with filters of width [2,3,4] of size 50 for a total of 150. For our model, the following hyper-parameters are estimated based on the validation set and used in the final test set: u = 64, d_a = 70, d_b = 35. (A PyTorch sketch of this setup appears as the third sketch after the table.)
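
The first sketch collects the reported splits into a plain Python mapping for quick reference. This is a minimal sketch: the `DATASET_SPLITS` name and the dict layout are assumptions made here; only the numbers come from the paper's Table 1 as quoted above.

```python
# Class counts and train/validation/test sizes as reported in Table 1.
# The mapping name and layout are assumptions made for this sketch.
DATASET_SPLITS = {
    # name: (n_classes, n_train, n_validation, n_test)
    "Weibo":          (7,      3771,   665,   500),
    "Product Review": (2,      7648,  1350,  1000),
    "News Title":     (18,   154999, 27300, 10000),
    "Topic":          (20,     6170,  1090,   700),
}

for name, (n_cls, n_train, n_val, n_test) in DATASET_SPLITS.items():
    print(f"{name}: {n_cls} classes, {n_train}/{n_val}/{n_test} train/val/test")
```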
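
Second, since the Software Dependencies row flags missing version pins, a reproducible setup would record exact versions and make the jieba segmentation step explicit. A minimal sketch assuming the standard `jieba.lcut` API; the pinned version number and the example sentence are illustrative, not taken from the paper.

```python
# Hypothetical version pin for a reproducible environment (not stated in the paper):
#   pip install jieba==0.42.1
import jieba

# Segment a Chinese short text into words, the role the paper assigns to jieba.
sentence = "知识驱动的注意力用于短文本分类"  # illustrative input, not from the datasets
words = jieba.lcut(sentence)
print(words)  # exact segmentation varies with jieba version and dictionary
```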
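
Third, the quoted Experiment Setup fixes enough of the encoder and optimizer to sketch in code. A minimal PyTorch sketch under one reading of "1D CNN with filters of width [2,3,4] of size 50 for a total of 150" (Conv1d banks followed by max-over-time pooling); the class name, vocabulary size, and module structure are assumptions, while the embedding dimension, filter widths and counts, learning rate, batch size, and epoch count are the paper's reported values. The attention hyper-parameters u, d_a, and d_b belong to the paper's knowledge powered attention module and are not modeled here.

```python
import torch
import torch.nn as nn

EMBED_DIM = 50             # 50-dimension skip-gram embeddings (reported)
FILTER_WIDTHS = [2, 3, 4]  # 1D CNN filter widths (reported)
FILTERS_PER_WIDTH = 50     # 50 filters per width, 150 features total (reported)

class CNNTextEncoder(nn.Module):
    """Conv1d banks with max-over-time pooling; name and structure are assumptions."""
    def __init__(self, vocab_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, EMBED_DIM)
        self.convs = nn.ModuleList(
            [nn.Conv1d(EMBED_DIM, FILTERS_PER_WIDTH, kernel_size=w)
             for w in FILTER_WIDTHS]
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> embeddings: (batch, EMBED_DIM, seq_len)
        x = self.embed(token_ids).transpose(1, 2)
        # Each filter bank is max-pooled over time, then all are concatenated.
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return torch.cat(pooled, dim=1)  # (batch, 150)

model = CNNTextEncoder(vocab_size=20000)  # vocabulary size is a placeholder
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # learning rate (reported)
BATCH_SIZE, EPOCHS = 64, 20  # batch size and epoch count (reported)
```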