Automated Rule Selection for Aspect Extraction in Opinion Mining

Authors: Qian Liu, Zhiqiang Gao, Bing Liu, Yuanlin Zhang

Venue: IJCAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiment results show that the proposed method can select a subset of a given rule set to achieve significantly better results than the full rule set and the existing state-of-the-art CRF-based supervised method.
Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Southeast University, China; (2) Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, China; (3) Department of Computer Science, University of Illinois at Chicago, USA; (4) Department of Computer Science, Texas Tech University, USA
Pseudocode | Yes | Algorithm 1 RS-DP (see the rule-selection sketch after the table).
Open Source Code | No | The paper does not include an unambiguous statement about releasing the source code for the methodology described, nor does it provide a direct link to a code repository.
Open Datasets | Yes | One is from [Hu and Liu, 2004], which contains five review datasets of four domains: digital cameras (D1, D2), cell phone (D3), MP3 player (D4), and DVD player (D5). The first collection has been widely used in aspect extraction evaluation by researchers [Hu and Liu, 2004; Popescu and Etzioni, 2005; Qiu et al., 2011; Liu et al., 2013a]. For seed opinion words, we used all (and only) the adjective opinion words in the opinion lexicon of [Hu and Liu, 2004] (footnote: http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html).
Dataset Splits | Yes | In testing RS-DP, RS-DP+, CRF and CRF+, to reflect cross-domain aspect extraction, we use leave-one-out cross validation for D1 to D5, i.e., the algorithm selects rules based on the annotated data from four products and tests the selected rules using the unseen data from the remaining product; for D6 to D8, the algorithm selects rules based on the annotated data from D1 to D5 and tests the selected rules using each of the data from D6 to D8. (See the split-protocol sketch after the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper mentions using the Stanford Parser for part-of-speech tagging and dependency parsing, but it does not specify a version number or any other software dependencies with their versions. (See the parsing sketch after the table.)
Experiment Setup | No | The paper describes the rule selection algorithm and the evaluation metrics used, but it does not provide specific experimental setup details such as hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or other concrete training configurations. (See the evaluation sketch after the table.)
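
Evaluation sketch. The Experiment Setup row notes that the paper reports its evaluation metrics but not training hyperparameters. As a point of reference only, precision, recall, and F1 for extracted aspect terms are commonly computed as below; this is a generic formulation that assumes exact matching of aspect terms, not code taken from the paper.

```python
# Generic precision/recall/F1 for aspect term extraction.
# Assumes exact string matching of aspect terms; not taken from the paper.

def evaluate_aspects(predicted, gold):
    """Return (precision, recall, f1) between predicted and gold aspect term sets."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)                      # correctly extracted aspects
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1

# Example: one correct aspect out of two predicted and two gold terms.
# evaluate_aspects({"battery life", "screen"}, {"battery life", "price"})
# -> (0.5, 0.5, 0.5)
```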
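
Split-protocol sketch. The Dataset Splits row describes leave-one-out cross validation over D1 to D5 plus a fixed arrangement for D6 to D8. The sketch below spells that protocol out; the dataset names are from the paper, while the surrounding function and the select_rules/evaluate calls in the usage comment are hypothetical placeholders.

```python
# Sketch of the cross-domain split protocol described in the paper.
# Dataset names D1..D8 are from the paper; everything else is a placeholder.

def make_splits():
    """Yield (selection_datasets, test_dataset) pairs following the protocol."""
    core = ["D1", "D2", "D3", "D4", "D5"]   # Hu & Liu (2004) review collections
    extra = ["D6", "D7", "D8"]              # additional evaluation datasets

    # Leave-one-out over D1..D5: select rules on four products, test on the fifth.
    for held_out in core:
        yield [d for d in core if d != held_out], held_out

    # For D6..D8: select rules on all of D1..D5, test on each remaining dataset.
    for test in extra:
        yield list(core), test

# Hypothetical usage (select_rules, evaluate and datasets are placeholders):
# for train_names, test_name in make_splits():
#     rules = select_rules([datasets[d] for d in train_names])
#     evaluate(rules, datasets[test_name])
```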
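
Rule-selection sketch. The Pseudocode row points to Algorithm 1 (RS-DP) in the paper. The code below is not a reproduction of that algorithm; it is only a generic greedy rule-selection loop under assumed interfaces (apply_rules, score_fn) that illustrates the idea of choosing a rule subset by its score on annotated data.

```python
# Illustrative greedy rule selection; NOT a reproduction of the paper's
# Algorithm 1 (RS-DP). Rule objects, apply_rules and score_fn are placeholders.

def greedy_rule_selection(rules, annotated_data, apply_rules, score_fn):
    """Greedily grow a rule subset while the score on annotated data improves.

    rules          : list of candidate extraction rules
    annotated_data : reviews with gold aspect annotations
    apply_rules    : hypothetical fn(rule_subset, annotated_data) -> extracted aspects
    score_fn       : hypothetical fn(extracted, annotated_data) -> float (e.g. F1)
    """
    selected, best_score = [], 0.0
    while True:
        best_rule, best_gain = None, 0.0
        for rule in rules:
            if rule in selected:
                continue
            score = score_fn(apply_rules(selected + [rule], annotated_data),
                             annotated_data)
            if score - best_score > best_gain:
                best_rule, best_gain = rule, score - best_score
        if best_rule is None:       # no remaining rule improves the score
            return selected
        selected.append(best_rule)
        best_score += best_gain
```

A measure such as the F1 in the evaluation sketch above could serve as score_fn; the paper's Algorithm 1 should be consulted for the actual selection criterion.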
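
Parsing sketch. The Software Dependencies row notes that the paper names the Stanford Parser without a version. Purely to illustrate the kind of POS tagging and dependency parsing involved, here is a sketch using the stanza package (a Stanford NLP toolkit for Python); this substitution is an assumption for illustration, not the paper's actual toolchain.

```python
# Illustrative preprocessing with stanza (an assumed substitute; the paper
# itself uses the Stanford Parser, version unspecified).
import stanza

stanza.download("en")  # one-time model download
nlp = stanza.Pipeline("en", processors="tokenize,pos,lemma,depparse")

doc = nlp("The battery life of this camera is great.")
for sent in doc.sentences:
    for word in sent.words:
        # word.head is 1-based; 0 means the syntactic root.
        head = sent.words[word.head - 1].text if word.head > 0 else "ROOT"
        print(word.text, word.upos, word.deprel, "<-", head)
```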