User Modeling with Neural Network for Review Rating Prediction

Authors: Duyu Tang, Bing Qin, Ting Liu, Yuekui Yang

IJCAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments to evaluate the effectiveness of the proposed method for review rating prediction. We use two benchmark datasets: one from movie reviews in Rotten Tomatoes and another from restaurant reviews in Yelp Dataset Challenge 2013. Extensive experimental results show that (1) the proposed method outperforms several strong baseline methods which only use textual semantics; (2) for the task of review rating prediction, matrix-vector multiplication is more effective to model user-word composition than vector concatenation or addition methods. The main contributions presented in this work are listed as follows: We represent user-word composition as matrix-vector multiplication, regarding each user as a matrix that modifies the meaning of a certain word. To our knowledge, this is the first neural network method that incorporates user information for review rating prediction. We report empirical results on two benchmark datasets. The proposed method performs better than strong baseline methods on the Yelp dataset.
Researcher Affiliation | Collaboration | Harbin Institute of Technology, Harbin, China; Intelligent Computing and Search Lab, Tencent, Shenzhen, China. {dytang, qinb, tliu}@ir.hit.edu.cn, yuekuiyang@tencent.com
Pseudocode | No | The paper describes the model and calculations in prose and equations, but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper links to word2vec (https://code.google.com/p/word2vec/), which is a third-party tool used by the authors, not their own source code for the proposed method. There is no explicit statement or link providing access to the authors' own implementation code.
Open Datasets | Yes | We conduct experiments on two benchmark datasets, Yelp13 and RT05. Yelp13 is a large-scale dataset consisting of restaurant reviews from Yelp. It is released by the third round of the Yelp Dataset Challenge in 2013. RT05 is a movie review dataset downloaded from Rotten Tomatoes. The statistical information of Yelp13 and RT05 are detailed in Table 1.
Dataset Splits | Yes | On Yelp13, we split the original corpus into train, dev and test sets with a 80:10:10 split. We train the rating predictor on the training set, tune parameters on the dev set and evaluate on the test set. On RT05, we use 10-fold cross-validation as in previous studies.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using "word2vec" to learn word vectors and "softmax" for prediction, and compares against baselines using "Supported Vector Machine (SVM)" and "Liblinear". However, no specific version numbers are provided for any of these software components or libraries.
Experiment Setup | Yes | We empirically set the vector dimension d as 100, the rank of user matrix r as 3. The values of W, b and u are randomly initialized with the fan-in trick. We use dropout [Srivastava et al., 2014] to avoid the neural network being over-fitting. Hyper parameters are tuned on the development dataset.
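
The Research Type and Experiment Setup entries describe each user as a matrix that modifies a word vector, with vector dimension d = 100 and user-matrix rank r = 3. The sketch below illustrates one way such a rank-limited user-word composition could be computed. It is not the authors' implementation: the factorization of the user matrix into a d x r and an r x d factor, the uniform range used for the "fan-in trick", and all names (`fan_in_uniform`, `compose_user_word`, `u_left`, `u_right`) are assumptions made for illustration, and the paper's nonlinearity, dropout, and softmax layer are omitted.

```python
import math
import random

D = 100  # word vector dimension d, as reported in the Experiment Setup entry
R = 3    # rank r of the user matrix, as reported in the Experiment Setup entry


def fan_in_uniform(rows, cols, rng):
    """Assumed fan-in initialization: uniform in [-1/sqrt(cols), 1/sqrt(cols)]."""
    bound = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-bound, bound) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def compose_user_word(u_left, u_right, word_vec):
    """Apply a rank-r user matrix (u_left times u_right) to a word vector.

    u_left is d x r and u_right is r x d (an assumed factorization), so the
    product costs O(d * r) instead of the O(d * d) of a full user matrix.
    """
    projected = matvec(u_right, word_vec)  # length r
    return matvec(u_left, projected)       # length d


rng = random.Random(0)  # fixed seed so the sketch is repeatable
u_left = fan_in_uniform(D, R, rng)    # hypothetical per-user parameters
u_right = fan_in_uniform(R, D, rng)   # hypothetical per-user parameters
word_vec = [rng.uniform(-1.0, 1.0) for _ in range(D)]  # stand-in for a word2vec vector
user_specific_vec = compose_user_word(u_left, u_right, word_vec)
```

Under these assumptions each user needs 2 * d * r = 600 parameters rather than d * d = 10,000, which is one plausible motivation for keeping the rank small.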