Target-Dependent Twitter Sentiment Classification with Rich Automatic Features
Authors: Duy-Tin Vo, Yue Zhang
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a standard data set show that the proposed method outperforms the method of Dong et al. [2014] by 4.8% in absolute accuracy, giving the best reported performance on the task. We perform a set of development experiments to evaluate the effectiveness of embeddings, context patterns, pooling functions, and sentiment lexicons on the performance of the proposed approach, tuning parameter values for our final model. (A pooling sketch follows the table.) |
| Researcher Affiliation | Academia | Duy-Tin Vo and Yue Zhang Singapore University of Technology and Design 8 Somapah Road, Singapore 487372 |
| Pseudocode | No | Information insufficient. The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | Information insufficient. The paper references third-party tools such as the 'word2vec package' and 'LibLinear' with URLs, but does not provide access to the source code for the methodology described in this paper. |
| Open Datasets | Yes | Our experiments are carried out on the target-dependent data set of Dong et al. [2014], which is manually annotated with sentiment labels (negative, positive, and neutral) toward given targets (such as bill gates, google, and xbox). We use the SSWE data to obtain SSWE embeddings. We use three sentiment lexicons, namely MPQA [Wilson et al., 2005], HL [Hu and Liu, 2004], and NRC emotion lexicon [Mohammad and Yang, 2011], integrating them to filter the context. |
| Dataset Splits | Yes | The data set includes 6248 training tweets and 692 testing tweets, with a balanced number of positive, negative, and neutral tweets (25%, 25%, and 50%, respectively). For tuning of a final three-way classification model, we perform five-fold cross validation on the training data to adjust features and the penalty parameter C. |
| Hardware Specification | No | Information insufficient. The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | Information insufficient. The paper mentions the 'word2vec package' and 'LibLinear' but does not specify their version numbers, which are needed to reproduce the software environment. |
| Experiment Setup | Yes | To learn distributed word representations using the word2vec package, we empirically choose 100, 3, and 10 for the embedding size, window length, and word count threshold, respectively. For tuning of a final three-way classification model, we perform five-fold cross validation on the training data to adjust features and the penalty parameter C. (Sketches of the word2vec configuration and the cross-validation tuning follow the table.) |
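
The sketches below are illustrative only and are not the authors' released code. First, the Research Type row mentions pooling functions applied to embeddings and lexicon-filtered contexts. The following is a minimal sketch of how pooling turns a variable-length span of word vectors into a fixed-length feature block, assuming a max/min/mean/std pooling combination; the exact set of pooling functions and the function names here are assumptions.

```python
import numpy as np

EMBEDDING_DIM = 100  # embedding size quoted in the Experiment Setup row


def pool_features(vectors):
    """Collapse an (n_words, EMBEDDING_DIM) array of word embeddings into one
    fixed-length feature vector with simple pooling functions; the
    max/min/mean/std combination is an illustrative assumption."""
    m = np.asarray(vectors, dtype=float)
    if m.size == 0:  # empty span (e.g. no left context): fall back to zeros
        m = np.zeros((1, EMBEDDING_DIM))
    return np.concatenate([m.max(axis=0), m.min(axis=0),
                           m.mean(axis=0), m.std(axis=0)])


# A target-dependent representation would concatenate pooled blocks for the
# left context, the target words, and the right context of each tweet.
```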
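
The Experiment Setup row reports an embedding size of 100, a window length of 3, and a word count threshold of 10 for the word2vec package. Below is a minimal sketch of that configuration using gensim's Word2Vec (gensim ≥ 4.0 parameter names) rather than the original word2vec tool; the corpus file name and loading step are hypothetical.

```python
from gensim.models import Word2Vec

# Hypothetical corpus: one tokenized tweet per line, whitespace-separated.
tokenized_tweets = [line.split() for line in open("tweets.tokenized.txt")]

model = Word2Vec(
    sentences=tokenized_tweets,
    vector_size=100,  # embedding size reported in the paper
    window=3,         # window length reported in the paper
    min_count=10,     # word count threshold reported in the paper
    workers=4,
)
model.wv.save("tweet_embeddings.kv")  # keyed vectors for feature extraction
```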
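
The Dataset Splits and Experiment Setup rows both mention five-fold cross validation on the 6248 training tweets to adjust the penalty parameter C of a LibLinear classifier. Here is a minimal sketch of that tuning loop with scikit-learn's LinearSVC, which is backed by the liblinear solver; the feature matrix, labels, and candidate C grid are assumptions.

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC  # scikit-learn's wrapper around liblinear


def tune_penalty(X_train, y_train, grid=(0.001, 0.01, 0.1, 1.0, 10.0)):
    """Return the C value with the best mean five-fold cross-validation
    accuracy, plus the full score table; the grid is illustrative."""
    mean_scores = {
        C: cross_val_score(LinearSVC(C=C), X_train, y_train,
                           cv=5, scoring="accuracy").mean()
        for C in grid
    }
    return max(mean_scores, key=mean_scores.get), mean_scores


# X_train (6248 x n_features) and y_train (negative / neutral / positive)
# would come from a pooled feature construction like the one sketched above.
```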