Target-Dependent Twitter Sentiment Classification with Rich Automatic Features
Authors: Duy-Tin Vo, Yue Zhang
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a standard data set show that the proposed method outperforms the method of Dong et al. [2014] by 4.8% in absolute accuracy, giving the best reported performance on the task. We perform a set of development experiments to evaluate the effectiveness of embeddings, context patterns, pooling functions, and sentiment lexicons on the performance of the proposed approach, tuning parameter values for our final model. (A pooling sketch follows the table.) |
| Researcher Affiliation | Academia | Duy-Tin Vo and Yue Zhang Singapore University of Technology and Design 8 Somapah Road, Singapore 487372 |
| Pseudocode | No | Information insufficient. The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | Information insufficient. The paper references third-party tools such as the 'word2vec package' and 'LibLinear' with URLs, but does not provide access to the source code for the methodology described in this paper. |
| Open Datasets | Yes | Our experiments are carried out on the target-dependent data set of Dong et al. [2014], which is manually annotated with sentiment labels (negative, positive, and neutral) toward given targets (such as bill gates, google, and xbox). We use the SSWE data to obtain SSWE embeddings. We use three sentiment lexicons, namely MPQA [Wilson et al., 2005], HL [Hu and Liu, 2004], and NRC emotion lexicon [Mohammad and Yang, 2011], integrating them to filter the context. |
| Dataset Splits | Yes | The data set includes 6248 training tweets and 692 testing tweets, with a balanced number of positive, negative, and neutral tweets (25%, 25%, and 50%, respectively). For tuning of a final three-way classification model, we perform five-fold cross validation on the training data to adjust features and the penalty parameter C. |
| Hardware Specification | No | Information insufficient. The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | Information insufficient. The paper mentions the 'word2vec package' and 'LibLinear' but does not specify their version numbers, which are needed to reproduce the software environment. |
| Experiment Setup | Yes | To learn distributed word representations using the word2vec package, we empirically choose 100, 3, and 10 for the embedding size, window length, and word count threshold, respectively. For tuning of a final three-way classification model, we perform five-fold cross validation on the training data to adjust features and the penalty parameter C. (Sketches of the word2vec configuration and the cross-validation tuning follow the table.) |
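
The sketches below are illustrative only and are not the authors' released code. First, the Research Type row mentions pooling functions applied to embeddings and lexicon-filtered contexts. The following is a minimal sketch of how pooling turns a variable-length span of word vectors into a fixed-length feature block, assuming a max/min/mean/std pooling combination; the exact set of pooling functions and the function names here are assumptions.

```python
import numpy as np

EMBEDDING_DIM = 100  # embedding size quoted in the Experiment Setup row


def pool_features(vectors):
    """Collapse an (n_words, EMBEDDING_DIM) array of word embeddings into one
    fixed-length feature vector with simple pooling functions; the
    max/min/mean/std combination is an illustrative assumption."""
    m = np.asarray(vectors, dtype=float)
    if m.size == 0:  # empty span (e.g. no left context): fall back to zeros
        m = np.zeros((1, EMBEDDING_DIM))
    return np.concatenate([m.max(axis=0), m.min(axis=0),
                           m.mean(axis=0), m.std(axis=0)])


# A target-dependent representation would concatenate pooled blocks for the
# left context, the target words, and the right context of each tweet.
```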
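
The Experiment Setup row reports an embedding size of 100, a window length of 3, and a word count threshold of 10 for the word2vec package. Below is a minimal sketch of that configuration using gensim's Word2Vec (gensim ≥ 4.0 parameter names) rather than the original word2vec tool; the corpus file name and loading step are hypothetical.

```python
from gensim.models import Word2Vec

# Hypothetical corpus: one tokenized tweet per line, whitespace-separated.
tokenized_tweets = [line.split() for line in open("tweets.tokenized.txt")]

model = Word2Vec(
    sentences=tokenized_tweets,
    vector_size=100,  # embedding size reported in the paper
    window=3,         # window length reported in the paper
    min_count=10,     # word count threshold reported in the paper
    workers=4,
)
model.wv.save("tweet_embeddings.kv")  # keyed vectors for feature extraction
```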
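
The Dataset Splits and Experiment Setup rows both mention five-fold cross validation on the 6248 training tweets to adjust the penalty parameter C of a LibLinear classifier. Here is a minimal sketch of that tuning loop with scikit-learn's LinearSVC, which is backed by the liblinear solver; the feature matrix, labels, and candidate C grid are assumptions.

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC  # scikit-learn's wrapper around liblinear


def tune_penalty(X_train, y_train, grid=(0.001, 0.01, 0.1, 1.0, 10.0)):
    """Return the C value with the best mean five-fold cross-validation
    accuracy, plus the full score table; the grid is illustrative."""
    mean_scores = {
        C: cross_val_score(LinearSVC(C=C), X_train, y_train,
                           cv=5, scoring="accuracy").mean()
        for C in grid
    }
    return max(mean_scores, key=mean_scores.get), mean_scores


# X_train (6248 x n_features) and y_train (negative / neutral / positive)
# would come from a pooled feature construction like the one sketched above.
```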