Pre-release Prediction of Crowd Opinion on Movies by Label Distribution Learning

Authors: Xin Geng, Peng Hou

IJCAI 2015

Reproducibility assessment (Variable: Result, followed by the LLM response):
Research Type: Experimental. Experimental results show that LDSVR can accurately predict people's rating distribution for a movie based only on the pre-release metadata of the movie. (Section 4, Experiments)
Researcher Affiliation: Academia. Xin Geng and Peng Hou, School of Computer Science and Engineering, Southeast University, Nanjing, China ({xgeng, hpeng}@seu.edu.cn).
Pseudocode: No. The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code: No. The paper does not provide concrete access to source code for the methodology described.
Open Datasets: No. The data set used in the experiments includes 7,755 movies and 54,242,292 ratings from 478,656 different users. The ratings come from Netflix and are on a scale from 1 to 5 integral stars. Each movie has, on average, 6,994 ratings. The rating distribution is calculated for each movie as an indicator of the crowd opinion on that movie. The pre-release metadata are crawled from IMDb according to the unique movie IDs. Table 1 lists all the metadata included in the data set.
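The per-movie rating distribution described above can be sketched in a few lines. This is a minimal illustration, not the paper's code; the helper name and the fixed 5-star binning are assumptions.

```python
import numpy as np

def rating_distribution(ratings, n_stars=5):
    """Turn a movie's list of 1..n_stars integer ratings into a
    probability distribution over star levels (the 'crowd opinion'
    label used in the paper). Hypothetical helper for illustration."""
    # Count occurrences of each star level; drop the unused 0-star bin.
    counts = np.bincount(np.asarray(ratings, dtype=int), minlength=n_stars + 1)[1:]
    # Normalize counts into a distribution that sums to 1.
    return counts / counts.sum()

# Example: six ratings for one movie -> a 5-bin distribution.
dist = rating_distribution([5, 4, 4, 3, 5, 2])
```

Each movie in the data set would yield one such distribution, which then serves as the label-distribution target for learning.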
Dataset Splits: Yes. The algorithm parameters used in the experiments are empirically determined. The parameter selection process is nested into the 10-fold cross validation. In detail, the whole data set is first randomly split into 10 chunks. Each time, one chunk is used as the test set, another as the validation set, and the remaining 8 chunks as the training set. The model is then trained with different parameter settings on the training set and tested on the validation set. This procedure is repeated for 10 folds, and the parameter setting with the best average performance is selected. After that, the original validation set is merged into the training set while the test set remains unchanged. The model is trained with the selected parameter setting on the updated training set and tested on the test set. This procedure is repeated for 10 folds, and the mean value and standard deviation of each evaluation measure are reported.
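The nested split protocol above can be sketched as follows. This is a sketch under assumptions: the function name, the round-robin chunking, and the choice of the next chunk as the validation set are illustrative; the paper only specifies a random 10-way split with rotating test/validation chunks.

```python
import random

def nested_cv_folds(n_items, n_folds=10, seed=0):
    """Yield (train_idx, val_idx, test_idx) for each fold: one chunk as
    test, one as validation, the remaining 8 as training. After parameter
    selection, the validation chunk would be merged back into training."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    # Split the shuffled indices into n_folds roughly equal chunks.
    chunks = [idx[i::n_folds] for i in range(n_folds)]
    for f in range(n_folds):
        test = chunks[f]
        val = chunks[(f + 1) % n_folds]          # assumed rotation scheme
        train = [i for g, c in enumerate(chunks)
                 if g not in (f, (f + 1) % n_folds)
                 for i in c]
        yield train, val, test
```

Each yielded triple partitions the data set, so every item appears exactly once per fold across the three roles.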
Hardware Specification: No. The paper does not provide specific hardware details used for running its experiments.
Software Dependencies: No. The paper mentions various algorithms and methods (e.g., BFGS-LLD, IIS-LLD, AA-kNN, CPNN, the RBF kernel) but does not provide specific version numbers for software libraries or dependencies.
Experiment Setup: Yes. All kernel-based methods (LDSVR, S-SVR, and M-SVRp) use the RBF kernel with the scaling factor σ equal to the average distance among the training examples. The penalty parameter C in Eq. (2) is set to 1, and the insensitivity parameter ε is set to 0.1. All iterative algorithms terminate when the difference between adjacent steps is smaller than 10^-10. The number of neighbors k in AA-kNN is set to 10, and the number of hidden neurons in CPNN is set to 80.
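The σ setting quoted above (average distance among training examples) can be sketched as below. The exact RBF form k(x, z) = exp(-||x - z||² / (2σ²)) is an assumption, since the paper's equations are not reproduced in this report; only the "σ = average pairwise distance" rule comes from the quoted setup.

```python
import numpy as np

def rbf_kernel_avg_sigma(X):
    """Compute an RBF kernel matrix over training examples X, with the
    scaling factor sigma set to the average pairwise Euclidean distance.
    Sketch only; the kernel parameterization is an assumed convention."""
    # Squared Euclidean distances between all pairs of rows.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    d = np.sqrt(d2)
    n = len(X)
    # Average distance over distinct pairs (upper triangle, diagonal excluded).
    sigma = d[np.triu_indices(n, k=1)].mean()
    K = np.exp(-d2 / (2 * sigma ** 2))
    return K, sigma
```

With σ fixed this way, the only remaining kernel-method hyperparameters in the quoted setup are C = 1 and ε = 0.1.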