Recurrent Convolutional Neural Networks for Text Classification
Authors: Siwei Lai, Liheng Xu, Kang Liu, Jun Zhao
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on four commonly used datasets. The experimental results show that the proposed method outperforms the state-of-the-art methods on several datasets, particularly on document-level datasets. |
| Researcher Affiliation | Academia | Siwei Lai, Liheng Xu, Kang Liu, Jun Zhao; National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, China; {swlai, lhxu, kliu, jzhao}@nlpr.ia.ac.cn |
| Pseudocode | No | The paper includes a network structure diagram (Figure 1) but no pseudocode or algorithm blocks; a hedged sketch of the described forward pass is given after this table. |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We perform the experiments using the following four datasets: 20Newsgroups, Fudan Set, ACL Anthology Network, and Sentiment Treebank. The paper provides URLs for each dataset: 'qwone.com/~jason/20Newsgroups/', 'www.datatang.com/data/44139 and 43543', 'old-site.clsp.jhu.edu/~sbergsma/Stylo/', 'nlp.stanford.edu/sentiment/'. |
| Dataset Splits | Yes | Table 1 provides Train/Dev/Test counts for each dataset (20News: 7520/836/5563, Fudan: 8823/981/9832, ACL: 146257/28565/28157, SST: 8544/1101/2210). Additionally, it states: 'The ACL and SST datasets have a pre-defined training, development and testing separation. For the other two datasets, we split 10% of the training set into a development set and keep the remaining 90% as the real training set.' An illustrative split snippet is shown after this table. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions software such as 'Stanford Tokenizer', 'ICTCLAS', and 'word2vec' but does not specify their version numbers, which would be needed for reproducibility. |
| Experiment Setup | Yes | We set the learning rate of the stochastic gradient descent α as 0.01, the hidden layer size as H = 100, the vector size of the word embedding as |e| = 50 and the size of the context vector as |c| = 50. |
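
Since the Pseudocode row notes that the paper provides no algorithm blocks, the sketch below illustrates the forward pass it describes: left and right context recurrences over word embeddings, concatenation into a per-word representation, a tanh projection, element-wise max-pooling, and an output layer. It uses the quoted sizes (|e| = 50, |c| = 50, H = 100); the NumPy implementation, the tanh recurrent activation, the vocabulary and class counts, and all variable names are assumptions for illustration, not the authors' code. (The SGD learning rate α = 0.01 applies to training, which is not shown here.)

```python
import numpy as np

# Illustrative sketch of the RCNN forward pass described in the paper (not the authors' code).
# Sizes follow the quoted setup: |e| = 50, |c| = 50, H = 100; vocabulary and class counts are assumed.
E, C, H, VOCAB, CLASSES = 50, 50, 100, 10000, 4

rng = np.random.default_rng(0)
emb  = rng.normal(scale=0.01, size=(VOCAB, E))       # word embeddings e(w)
W_l  = rng.normal(scale=0.01, size=(C, C))           # left-context recurrence
W_sl = rng.normal(scale=0.01, size=(C, E))           # previous embedding -> left context
W_r  = rng.normal(scale=0.01, size=(C, C))           # right-context recurrence
W_sr = rng.normal(scale=0.01, size=(C, E))           # next embedding -> right context
W_2  = rng.normal(scale=0.01, size=(H, C + E + C))   # projection to the latent semantic vector
b_2  = np.zeros(H)
W_4  = rng.normal(scale=0.01, size=(CLASSES, H))     # output layer
b_4  = np.zeros(CLASSES)

def rcnn_forward(word_ids):
    """Return log class probabilities for one document given as a list of word indices."""
    e = emb[word_ids]                                 # (T, E)
    T = len(word_ids)
    c_l = np.zeros((T, C))                            # left contexts, scanned left to right
    c_r = np.zeros((T, C))                            # right contexts, scanned right to left
    for i in range(1, T):
        c_l[i] = np.tanh(W_l @ c_l[i - 1] + W_sl @ e[i - 1])
    for i in range(T - 2, -1, -1):
        c_r[i] = np.tanh(W_r @ c_r[i + 1] + W_sr @ e[i + 1])
    x = np.concatenate([c_l, e, c_r], axis=1)         # per-word representation (T, C + E + C)
    y2 = np.tanh(x @ W_2.T + b_2)                     # latent semantic vectors (T, H)
    y3 = y2.max(axis=0)                               # element-wise max-pooling over positions
    scores = W_4 @ y3 + b_4                           # pre-softmax class scores
    return scores - np.log(np.exp(scores).sum())      # log-softmax

print(rcnn_forward([12, 7, 345, 9]).shape)            # (CLASSES,) -> (4,)
```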
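
The Dataset Splits row states that, for 20Newsgroups and Fudan, 10% of the training set is held out as a development set. Below is a minimal sketch of that hold-out, assuming a shuffled split and an arbitrary seed; the quoted text specifies neither.

```python
import random

def hold_out_dev(train_docs, dev_fraction=0.10, seed=0):
    """Split a training collection into (train, dev); the shuffle and seed are assumptions."""
    docs = list(train_docs)
    random.Random(seed).shuffle(docs)
    n_dev = round(len(docs) * dev_fraction)
    return docs[n_dev:], docs[:n_dev]

# With the 8356 20News training documents (7520 + 836 in Table 1):
train, dev = hold_out_dev(range(8356))
print(len(train), len(dev))  # 7520 836
```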