Identifying Sentiment Words Using an Optimization Model with L1 Regularization

Authors: Zhi-Hong Deng, Hongliang Yu, Yunlun Yang

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experiments on the real datasets show that ISOMER outperforms the classic approaches, and that the lexicon learned by ISOMER can be effectively adapted to document-level sentiment analysis.
Researcher Affiliation | Academia | Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China; Language Technologies Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
Pseudocode | No | The paper describes the solution process as a sub-gradient method and outlines its iteration steps, but it does not present a formally structured pseudocode block or algorithm figure. (An illustrative sub-gradient sketch follows the table.)
Open Source Code | No | The paper does not provide any links to open-source code or any explicit statement about a public release.
Open Datasets | Yes | The Cornell Movie Review Data, first used in (Pang, Lee, and Vaithyanathan 2002), is a widely used benchmark. This corpus contains 1,000 positive and 1,000 negative processed reviews of movies, extracted from the Internet Movie Database. The other corpus is the Stanford Large Movie Review Dataset (Maas et al. 2011), a collection of 50,000 reviews from IMDB, half of which are positive and half negative. We use the MPQA subjective lexicon to generate the gold standard. (A loader sketch for the Stanford corpus follows the table.)
Dataset Splits | No | The paper mentions 10-fold cross-validation for the document-level sentiment classification task, but it does not give train/validation/test splits for the core sentiment word identification problem that ISOMER addresses. It mentions randomly selecting seed words and candidate words, but no explicit splits for model training and validation. (The cross-validation protocol is sketched after the table.)
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to conduct the experiments.
Software Dependencies | No | The paper mentions using the word segmentation tool ICTCLAS and an SVM classifier but does not specify version numbers for these software dependencies. It only provides a general reference for the SVM classifier (a URL to libsvm).
Experiment Setup | Yes | We adopt the above settings in our experiments. In our model, the tuning parameter β determines the proportion of selected sentiment words in the candidate set, called density. As β increases, the regularizer tends to select fewer and more significant words. For convenience of comparing with other methods, we choose β for each dataset so that the density approximately equals the real value, i.e. β = 2 × 10⁻⁴ for the Stanford and Chinese datasets and β = 9 × 10⁻⁴ for the Cornell dataset. ... TF-IDF is used as the word weighting scheme to compute f_ij in our model. (The β–density relationship is sketched after the table.)
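
The table's Pseudocode row notes that the paper outlines a sub-gradient solution rather than a formal algorithm block. The sketch below shows what such a loop can look like for a generic L1-regularized objective; it is an assumption-laden stand-in, not ISOMER itself: the squared loss, the names `F`, `y`, `s`, and the toy data are all illustrative.

```python
import numpy as np

def l1_subgradient_descent(F, y, beta, lr=0.01, n_iters=1000):
    """Sub-gradient descent for min_s ||F s - y||^2 + beta * ||s||_1.

    A generic stand-in objective, not the paper's exact loss. The
    sub-gradient of |s_i| is sign(s_i), taken as 0 at s_i = 0.
    """
    s = np.zeros(F.shape[1])                  # one sentiment score per candidate word
    for t in range(n_iters):
        grad_smooth = 2 * F.T @ (F @ s - y)   # gradient of the squared loss
        subgrad_l1 = beta * np.sign(s)        # sub-gradient of the L1 penalty
        step = lr / np.sqrt(t + 1)            # diminishing step size
        s -= step * (grad_smooth + subgrad_l1)
    return s

# Toy usage: 5 documents, 8 candidate words, random TF-IDF-like weights.
rng = np.random.default_rng(0)
F = rng.random((5, 8))
y = np.array([1.0, 1.0, -1.0, -1.0, 1.0])    # document polarities
print(l1_subgradient_descent(F, y, beta=2e-4))
```

The diminishing step size is the standard choice for sub-gradient methods, since a fixed step does not converge on non-smooth objectives.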
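For the Stanford Large Movie Review Dataset cited in the Open Datasets row, here is a minimal loader, assuming the archive's standard aclImdb/{train,test}/{pos,neg} directory layout; the function name and the root path are illustrative.

```python
from pathlib import Path

def load_imdb_split(root, split):
    """Read reviews from the extracted aclImdb archive.

    Expects the standard layout: <root>/<split>/pos/*.txt and
    <root>/<split>/neg/*.txt.
    """
    docs, labels = [], []
    for label_name, label in (("pos", 1), ("neg", 0)):
        for path in sorted(Path(root, split, label_name).glob("*.txt")):
            docs.append(path.read_text(encoding="utf-8"))
            labels.append(label)
    return docs, labels

# docs, labels = load_imdb_split("aclImdb", "train")   # 25,000 labeled reviews
```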
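The Dataset Splits and Software Dependencies rows together describe the document-level evaluation: 10-fold cross-validation with an SVM classifier (the paper points to libsvm). Below is a sketch of that protocol, with scikit-learn's SVC swapped in for libsvm and toy documents standing in for the real corpora.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Toy stand-ins for the movie-review corpora.
docs = ["great movie", "terrible plot", "loved it", "boring and slow"] * 5
labels = [1, 0, 1, 0] * 5                            # 1 = positive, 0 = negative

clf = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
scores = cross_val_score(clf, docs, labels, cv=10)   # 10-fold cross-validation
print(f"mean accuracy: {scores.mean():.3f}")
```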
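Finally, the β–density relationship quoted in the Experiment Setup row, where raising the L1 weight shrinks the set of selected words, can be reproduced in miniature with any L1-regularized model. Here scikit-learn's Lasso stands in for ISOMER's regularizer; the random data, the alpha grid, and the non-zero threshold are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
F = rng.random((50, 200))            # 50 documents, 200 candidate words
y = rng.choice([-1.0, 1.0], 50)      # document polarities

def density(model):
    """Fraction of candidate words kept (non-zero coefficient)."""
    return float(np.mean(model.coef_ != 0))

# As the L1 weight grows, fewer candidate words survive.
for alpha in (1e-4, 1e-3, 1e-2, 1e-1):
    model = Lasso(alpha=alpha, max_iter=10_000).fit(F, y)
    print(f"alpha={alpha:g}  density={density(model):.2f}")
```

In the paper's setting, β is tuned until this density approximately matches the real proportion of sentiment words in the candidate set.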