Collaborative Filtering with Localised Ranking

Authors: Charanpal Dhanjal, Romaric Gaudel, Stéphan Clémençon

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In addition, we provide computational results as to the efficacy of the proposed method using synthetic and real data. In this section we analyse various properties of MLAUC empirically as well as making comparisons between it and other user-item matrix factorisation methods including Soft Impute (Mazumder, Hastie, and Tibshirani 2010), Weighted Regularised Matrix Factorisation (WRMF, (Hu, Koren, and Volinsky 2008)) and BPR. The aim is to evaluate our ranking objective and optimisation algorithm relative to other loss functions when considering results at the very top of the list for each user.
Researcher Affiliation | Academia | Charanpal Dhanjal and Stéphan Clémençon, Institut Mines-Télécom; Télécom ParisTech: CNRS LTCI, F-75634 Paris Cedex 13, France, charanpal.dhanjal@telecom-paristech.fr, stephan.clemencon@telecom-paristech.fr. Romaric Gaudel, LIFL: UMR 8022 University of Lille/CNRS & INRIA Lille Nord Europe, F-59655 Villeneuve d'Ascq Cedex, France, romaric.gaudel@inria.fr
Pseudocode | No | The paper describes algorithms verbally but does not provide pseudocode or a clearly labeled algorithm block.
Open Source Code | No | The paper does not provide any statement about releasing source code for the described methodology or a link to a code repository.
Open Datasets | Yes | Next we consider a set of real-world datasets: MovieLens, Flixster and Mendeley coauthors. For the MovieLens and Flixster datasets, ratings are given on scales of 1 to 5 and those greater than 3 are considered relevant, with the remaining ones set to zero. The Mendeley data is generated in the following manner: the raw data consists of authors and the documents stored in a matrix Y such that if the ith author wrote the jth document then Y_ij = 10, and if it is referenced in his/her library then Y_ij = 1. The rows of Y are normalised to have unit norm and an author-author matrix X̂ = YY^T is computed. Finally, values are thresholded such that X_ij = I(X̂_ij > σ) where σ = 0.05, and we subsample 990 users randomly to form the dataset. For all datasets, we remove users with fewer than 10 items. Properties of the resulting matrices are shown in Table 3.
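The Mendeley author-author matrix construction quoted above (row-normalise Y, form X̂ = YY^T, threshold at σ = 0.05) can be sketched as follows. This is an illustrative reconstruction on toy data, not the authors' code; the matrix sizes and the random authorship scores are assumptions.

```python
import numpy as np

# Toy authorship matrix Y: 10 for "wrote", 1 for "in library", 0 otherwise
# (sizes and sampling probabilities are hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_authors, n_docs = 5, 8
Y = rng.choice([0.0, 1.0, 10.0], size=(n_authors, n_docs), p=[0.6, 0.3, 0.1])

# Normalise each row of Y to unit norm (guarding against all-zero rows).
norms = np.linalg.norm(Y, axis=1, keepdims=True)
norms[norms == 0] = 1.0
Y_norm = Y / norms

# Author-author similarity matrix, then threshold: X_ij = I(X_hat_ij > sigma).
X_hat = Y_norm @ Y_norm.T
sigma = 0.05
X = (X_hat > sigma).astype(int)
```

The resulting X is a binary author-author adjacency matrix, from which users (rows) with too few items would then be removed.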
Dataset Splits | Yes | The performance at the top of the list using Synthetic2 as the optimisation proceeds is recorded using a validation set which is composed of 3 relevant items for each user. As an initial step in the learning process we perform model selection on the training nonzero elements using 3-fold cross validation with 3 relevant items taken from each user. We take a sample of 3 validation items from a proportion of 0.2 of the users to form this validation set.
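The hold-out scheme quoted above (3 validation items sampled from 20% of the users) could be implemented along these lines. This is a sketch under assumed data structures (a dict of user → relevant-item lists), not the authors' implementation; the function name and defaults are hypothetical.

```python
import numpy as np

def make_validation_split(relevant, user_fraction=0.2, n_val_items=3, seed=0):
    """Hold out n_val_items relevant items for a random fraction of users.

    relevant: dict mapping user id -> list of relevant item ids.
    Returns (train, val): train maps every user to remaining items,
    val maps the selected users to their held-out item sets.
    """
    rng = np.random.default_rng(seed)
    # Only users with enough items can donate validation items.
    eligible = [u for u, items in relevant.items() if len(items) > n_val_items]
    n_val_users = max(1, int(user_fraction * len(eligible)))
    val_users = set(rng.choice(eligible, size=n_val_users, replace=False).tolist())

    train, val = {}, {}
    for u, items in relevant.items():
        if u in val_users:
            held = rng.choice(items, size=n_val_items, replace=False)
            val[u] = set(held.tolist())
            train[u] = [i for i in items if i not in val[u]]
        else:
            train[u] = list(items)
    return train, val
```

The 3-fold cross validation for model selection would then be run on the training portion only, keeping the held-out items untouched.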
Hardware Specification | Yes | The experiment is run on an Intel Core i7-3740QM CPU with 8 cores and 16 GB of RAM and we record the objective value every 5 iterations.
Software Dependencies | No | The paper describes algorithms and methods (e.g., stochastic gradient descent, SVD) but does not specify any software libraries or their version numbers used for implementation.
Experiment Setup | Yes | The MLAUC algorithm is set up with k = 8 using s_U = 30 row samples and s_Y = 10 column samples to compute approximate derivatives for u_i and v_i. The initial values of U and V are found using an SVD of the ratings matrix, then we fix learning rate α = 0.1, regularisation λ = 0.1, maximum iterations T = 100, item distribution exponent β = 0.5 and positive item loss φ(x) = tanh(x).
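The initialisation step quoted above (U and V from an SVD of the ratings matrix, with the listed fixed hyperparameters) can be sketched as below. The rank-k truncated SVD with singular values split evenly between the two factors is an assumption on our part; the paper does not say how the SVD factors are mapped to U and V, and the toy matrix is purely illustrative.

```python
import numpy as np

k = 8  # latent dimension from the quote
rng = np.random.default_rng(1)
X = (rng.random((50, 40)) > 0.8).astype(float)  # toy binary ratings matrix

# Truncated SVD: keep the top-k singular triplets and split the singular
# values between the user factors U and item factors V (assumed convention).
P, s, Qt = np.linalg.svd(X, full_matrices=False)
U = P[:, :k] * np.sqrt(s[:k])
V = Qt[:k, :].T * np.sqrt(s[:k])

# Fixed hyperparameters as stated in the quote.
alpha = 0.1          # learning rate
lam = 0.1            # regularisation
T = 100              # maximum iterations
beta = 0.5           # item distribution exponent
phi = np.tanh        # positive item loss phi(x) = tanh(x)
```

With this split, U @ V.T reproduces the best rank-k approximation of X, which the subsequent stochastic gradient updates would then refine.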