Cold-start Active Learning with Robust Ordinal Matrix Factorization
Authors: Neil Houlsby, José Miguel Hernández-Lobato, Zoubin Ghahramani
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our new model and active learning strategy on a diverse collection of rating datasets: i) MovieLens 100K and MovieLens 1M: two collections of ratings of movies; ii) Movie Tweets: movie ratings obtained from Twitter; iii) Webscope: ratings of songs; iv) Jester: ratings of jokes; v) Book: ratings of books; vi) Dating: ratings from an online dating website; and vii) IPIP: ordinal responses to a psychometrics questionnaire. All the matrix entries in IPIP are observed; the other datasets have many missing entries. Descriptions, links to the data, and our pre-processing steps are in the supplementary material. We split the available ratings for each dataset randomly into a training and a test set with 80% and 20% of the ratings, respectively. Each method was adjusted using the entries in the training set and then evaluated using predictive log-likelihood on the corresponding test set. The entire procedure, including dataset partitioning, was repeated 20 times. Table 1 contains the test log-likelihoods and Figure 2 summarizes the performance of each algorithm. The proposed model, HOMF, outperforms all the other methods in all datasets. (A sketch of this repeated 80/20 evaluation protocol follows the table.) |
| Researcher Affiliation | Academia | Neil Houlsby (NMTH2@CAM.AC.UK), José Miguel Hernández-Lobato (JMH233@CAM.AC.UK), Zoubin Ghahramani (ZOUBIN@ENG.CAM.AC.UK), University of Cambridge, Department of Engineering, Cambridge CB2 1PZ, UK |
| Pseudocode | No | The paper includes 'Figure 1. Graphical model for the robust method for ordinal matrix data as it is described in the main text.' which is a graphical model, not pseudocode or an algorithm block. No other pseudocode or algorithm blocks are provided. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a direct link to a code repository for the methodology described. |
| Open Datasets | Yes | We evaluate our new model and active learning strategy on a diverse collection of rating datasets: i) MovieLens 100K and MovieLens 1M: two collections of ratings of movies; ii) Movie Tweets: movie ratings obtained from Twitter; iii) Webscope: ratings of songs; iv) Jester: ratings of jokes; v) Book: ratings of books; vi) Dating: ratings from an online dating website; and vii) IPIP: ordinal responses to a psychometrics questionnaire. All the matrix entries in IPIP are observed; the other datasets have many missing entries. Descriptions, links to the data, and our pre-processing steps are in the supplementary material. |
| Dataset Splits | Yes | We split the available ratings for each dataset randomly into a training and a test set with 80% and 20% of the ratings, respectively. We partitioned the data randomly into three sets: training, test and pool. For this, we sampled 75% of the users and added all of their ratings to the training set. These represented the ratings for the users that were already in the system. Each of the remaining 25% test users was initialized with a single item, adding that rating to the training set. For each test user we sampled three ratings and added these to the test set. The remaining ratings were added to the pool set. Figure 4 illustrates this set-up. (A code sketch of this cold-start partition follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory specifications). |
| Software Dependencies | No | The paper does not provide specific version numbers for any ancillary software dependencies (e.g., Python version, library versions like PyTorch or TensorFlow, or specific solver versions). |
| Experiment Setup | Yes | For all models we fix the latent dimension to h = 10. HOMF was adjusted using the available ratings in the training set. Then, during each iteration of active learning, a single rating was selected from the pool set for each test user. The selected ratings were then added to the training set and HOMF was incrementally readjusted using the new training set. We evaluated the second term in (6) using Monte Carlo sampling from Q with 100 samples. We evaluate the accuracy of three approximations: Monte Carlo (MC) sampling, the unscented approximation, and evaluating the integral with a delta function located at the mode of Q. (A sketch of this selection-and-refit loop follows the table.) |
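
The evaluation protocol quoted in the Research Type row (a random 80/20 split, model fitting on the training portion, and test predictive log-likelihood, repeated 20 times) can be summarized in a short sketch. This is illustrative only, under stated assumptions: `all_ratings` is a list of `(user, item, rating)` triples and `fit_model` is a hypothetical callable returning an object with a `log_likelihood(entries)` method; neither comes from the authors' code.

```python
import numpy as np

def repeated_holdout_loglik(all_ratings, fit_model, n_repeats=20, test_frac=0.2, seed=0):
    """Repeat a random 80/20 split and score test predictive log-likelihood each time.

    `fit_model` and `log_likelihood` are assumed interfaces for illustration.
    """
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_repeats):
        idx = rng.permutation(len(all_ratings))
        n_test = int(test_frac * len(all_ratings))
        test = [all_ratings[i] for i in idx[:n_test]]
        train = [all_ratings[i] for i in idx[n_test:]]
        model = fit_model(train)          # adjust the method on the training entries
        scores.append(model.log_likelihood(test))  # evaluate on the held-out entries
    return float(np.mean(scores)), float(np.std(scores))
```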
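
The cold-start partition quoted in the Dataset Splits row (75% of users fully in training; each remaining test user seeded with one rating, three ratings held out for testing, and the rest placed in the pool) could look roughly like the sketch below. The data layout (a dict mapping user id to a list of `(item, rating)` pairs) and all names are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

def cold_start_split(ratings, train_user_frac=0.75, n_test_per_user=3, seed=0):
    """Partition ratings into training, test, and pool sets for cold-start users."""
    rng = np.random.default_rng(seed)
    users = list(ratings.keys())
    rng.shuffle(users)
    n_train_users = int(train_user_frac * len(users))
    train, test, pool = [], [], []

    # Existing users: all of their ratings go to the training set.
    for u in users[:n_train_users]:
        train.extend((u, i, r) for i, r in ratings[u])

    # Cold-start (test) users: one seed rating in training,
    # a few ratings held out for evaluation, the rest in the pool.
    for u in users[n_train_users:]:
        entries = list(ratings[u])
        rng.shuffle(entries)
        seed_entry = entries[0]
        held_out = entries[1:1 + n_test_per_user]
        rest = entries[1 + n_test_per_user:]
        train.append((u, *seed_entry))
        test.extend((u, i, r) for i, r in held_out)
        pool.extend((u, i, r) for i, r in rest)

    return train, test, pool
```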
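
The per-iteration procedure in the Experiment Setup row (select one pool rating per test user, add the selected ratings to the training set, and incrementally readjust the model, with expectations estimated from 100 Monte Carlo samples of Q) is sketched below. The model methods `sample_posterior`, `rating_probs`, and `fit_incremental` are hypothetical stand-ins, and the acquisition score shown is a generic Monte Carlo predictive-entropy estimate, not the paper's exact information-theoretic criterion.

```python
import numpy as np

def active_learning_loop(model, train, pool, n_iters=10, n_mc=100):
    """Illustrative cold-start active-learning loop.

    Assumes `pool` maps each test user to a dict {item: rating}; the model
    interface here is invented for the sketch.
    """
    for _ in range(n_iters):
        # Draw posterior samples once per iteration (100 samples, as reported).
        posterior_samples = model.sample_posterior(n_mc)
        for user, candidates in pool.items():
            if not candidates:
                continue

            def score(item):
                # Entropy of the Monte Carlo average of the predictive
                # rating distribution for this (user, item) pair.
                probs = np.mean([model.rating_probs(s, user, item)
                                 for s in posterior_samples], axis=0)
                return -np.sum(probs * np.log(probs + 1e-12))

            best = max(candidates, key=score)
            # Move the selected rating from the pool into the training set.
            train.append((user, best, candidates.pop(best)))
        # Incrementally readjust the model with the enlarged training set.
        model.fit_incremental(train)
    return model
```

Drawing the posterior samples once per iteration, rather than once per candidate item, keeps the Monte Carlo cost manageable when each test user has many pool items; this is a design choice of the sketch, not a detail taken from the paper.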