Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

User Modeling with Neural Network for Review Rating Prediction

Authors: Duyu Tang, Bing Qin, Ting Liu, Yuekui Yang

IJCAI 2015 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct experiments to evaluate the effectiveness of the proposed method for review rating prediction. We use two benchmark datasets: one from movie reviews in Rotten Tomatoes and another from restaurant reviews in Yelp Dataset Challenge 2013. Extensive experimental results show that (1) the proposed method outperforms several strong baseline methods which only use textual semantics; (2) for the task of review rating prediction, matrix-vector multiplication is more effective to model user-word composition than vector concatenation or addition methods. The main contributions presented in this work are listed as follows: We represent user-word composition as matrix-vector multiplication, regarding each user as a matrix that modifies the meaning of a certain word. To our knowledge, this is the first neural network method that incorporates user information for review rating prediction. We report empirical results on two benchmark datasets. The proposed method performs better than strong baseline methods on the Yelp dataset.
Researcher Affiliation	Collaboration	Harbin Institute of Technology, Harbin, China Intelligent Computing and Search Lab, Tencent, Shenzhen, China EMAIL, EMAIL
Pseudocode	No	The paper describes the model and calculations in prose and equations, but does not include any pseudocode or algorithm blocks.
Open Source Code	No	The paper links to word2vec (https://code.google.com/p/word2vec/), which is a third-party tool used by the authors, not their own source code for the proposed method. There is no explicit statement or link providing access to the authors' own implementation code.
Open Datasets	Yes	We conduct experiments on two benchmark datasets, Yelp13 and RT05. Yelp13 is a large-scale dataset consisting of restaurant reviews from Yelp. It is released by the third round of the Yelp Dataset Challenge in 2013. RT05 is a movie review dataset downloaded from Rotten Tomatoes. The statistical information of Yelp13 and RT05 are detailed in Table 1.
Dataset Splits	Yes	On Yelp13, we split the original corpus into train, dev and test sets with a 80:10:10 split. We train the rating predictor on the training set, tune parameters on the dev set and evaluate on the test set. On RT05, we use 10-fold cross-validation as in previous studies.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions using "word2vec" to learn word vectors and "softmax" for prediction, and compares against baselines using "Supported Vector Machine (SVM)" and "Liblinear". However, no specific version numbers are provided for any of these software components or libraries.
Experiment Setup	Yes	We empirically set the vector dimension d as 100, the rank of user matrix r as 3. The values of W, b and u are randomly initialized with the fan-in trick. We use dropout [Srivastava et al., 2014] to avoid the neural network being over-fitting. Hyper parameters are tuned on the development dataset.
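
The Dataset Splits row above quotes an 80:10:10 train/dev/test split on Yelp13 and 10-fold cross-validation on RT05. A minimal sketch of what such a protocol could look like follows. The quoted text does not say whether the split was random, which seed was used, or how folds were assigned, so the shuffling, the seeds, and the helper names `train_dev_test_split` and `k_fold_indices` are assumptions made for illustration, not the authors' procedure.

```python
import random


def train_dev_test_split(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items and cut them into train/dev/test lists.

    Assumption: a seeded random shuffle; the source only states the 80:10:10 ratio.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * ratios[0])
    n_dev = int(n * ratios[1])
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # the remainder goes to the test set
    return train, dev, test


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Assumption: folds are built by dealing shuffled indices round-robin.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [
            idx
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for idx in fold
        ]
        yield train_idx, test_idx
```

In this sketch, the dev list would be used for tuning hyperparameters and the test list only for the final evaluation, matching the roles described in the row; each of the ten folds serves as the held-out set exactly once.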