Collaborative Filtering with Localised Ranking

Authors: Charanpal Dhanjal, Romaric Gaudel, Stéphan Clémençon

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In addition, we provide computational results as to the efficacy of the proposed method using synthetic and real data. In this section we analyse various properties of MLAUC empirically as well as making comparisons between it and other user-item matrix factorisation methods including Soft Impute (Mazumder, Hastie, and Tibshirani 2010), Weighted Regularised Matrix Factorisation (WRMF, (Hu, Koren, and Volinsky 2008)) and BPR. The aim is to evaluate our ranking objective and optimisation algorithm relative to other loss functions when considering results at the very top of the list for each user.
Researcher Affiliation | Academia | Charanpal Dhanjal and Stéphan Clémençon, Institut Mines-Télécom; Télécom ParisTech: CNRS LTCI, F-75634 Paris Cedex 13, France, charanpal.dhanjal@telecom-paristech.fr, stephan.clemencon@telecom-paristech.fr. Romaric Gaudel, LIFL: UMR 8022 University of Lille/CNRS & INRIA Lille Nord Europe, F-59655 Villeneuve d'Ascq Cedex, France, romaric.gaudel@inria.fr
Pseudocode | No | The paper describes algorithms verbally but does not provide pseudocode or a clearly labeled algorithm block.
Open Source Code | No | The paper does not provide any statement about releasing source code for the described methodology or a link to a code repository.
Open Datasets | Yes | Next we consider a set of real-world datasets: MovieLens, Flixster and Mendeley coauthors. For the MovieLens and Flixster datasets, ratings are given on scales of 1 to 5 and those greater than 3 are considered relevant, with the remaining ones set to zero. The Mendeley data is generated in the following manner: the raw data consists of authors and the documents stored in a matrix Y such that if the ith author wrote the jth document then Y_ij = 10, and if it is referenced in his/her library then Y_ij = 1. The rows of Y are normalised to have unit norm and an author-author matrix X̂ = YY^T is computed. Finally, values are thresholded such that X_ij = I(X̂_ij > σ) where σ = 0.05, and we subsample 990 users randomly to form the dataset. For all datasets, we remove users with fewer than 10 items. Properties of the resulting matrices are shown in Table 3.
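The Mendeley author-author matrix construction quoted above (row-normalise Y, form X̂ = YY^T, threshold at σ = 0.05) can be sketched as follows. This is an illustrative reconstruction on toy data, not the authors' code; the matrix sizes and the random authorship scores are assumptions.

```python
import numpy as np

# Toy authorship matrix Y: 10 for "wrote", 1 for "in library", 0 otherwise
# (sizes and sampling probabilities are hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_authors, n_docs = 5, 8
Y = rng.choice([0.0, 1.0, 10.0], size=(n_authors, n_docs), p=[0.6, 0.3, 0.1])

# Normalise each row of Y to unit norm (guarding against all-zero rows).
norms = np.linalg.norm(Y, axis=1, keepdims=True)
norms[norms == 0] = 1.0
Y_norm = Y / norms

# Author-author similarity matrix, then threshold: X_ij = I(X_hat_ij > sigma).
X_hat = Y_norm @ Y_norm.T
sigma = 0.05
X = (X_hat > sigma).astype(int)
```

The resulting X is a binary author-author adjacency matrix, from which users (rows) with too few items would then be removed.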
Dataset Splits | Yes | The performance at the top of the list using Synthetic2 as the optimisation proceeds is recorded using a validation set which is composed of 3 relevant items for each user. As an initial step in the learning process we perform model selection on the training nonzero elements using 3-fold cross validation with 3 relevant items taken from each user. We take a sample of 3 validation items from a proportion of 0.2 of the users to form this validation set.
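The hold-out scheme quoted above (3 validation items sampled from 20% of the users) could be implemented along these lines. This is a sketch under assumed data structures (a dict of user → relevant-item lists), not the authors' implementation; the function name and defaults are hypothetical.

```python
import numpy as np

def make_validation_split(relevant, user_fraction=0.2, n_val_items=3, seed=0):
    """Hold out n_val_items relevant items for a random fraction of users.

    relevant: dict mapping user id -> list of relevant item ids.
    Returns (train, val): train maps every user to remaining items,
    val maps the selected users to their held-out item sets.
    """
    rng = np.random.default_rng(seed)
    # Only users with enough items can donate validation items.
    eligible = [u for u, items in relevant.items() if len(items) > n_val_items]
    n_val_users = max(1, int(user_fraction * len(eligible)))
    val_users = set(rng.choice(eligible, size=n_val_users, replace=False).tolist())

    train, val = {}, {}
    for u, items in relevant.items():
        if u in val_users:
            held = rng.choice(items, size=n_val_items, replace=False)
            val[u] = set(held.tolist())
            train[u] = [i for i in items if i not in val[u]]
        else:
            train[u] = list(items)
    return train, val
```

The 3-fold cross validation for model selection would then be run on the training portion only, keeping the held-out items untouched.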
Hardware Specification | Yes | The experiment is run on an Intel Core i7-3740QM CPU with 8 cores and 16 GB of RAM and we record the objective value every 5 iterations.
Software Dependencies | No | The paper describes algorithms and methods (e.g., stochastic gradient descent, SVD) but does not specify any software libraries or their version numbers used for implementation.
Experiment Setup | Yes | The MLAUC algorithm is set up with k = 8 using s_U = 30 row samples and s_Y = 10 column samples to compute approximate derivatives for u_i and v_i. The initial values of U and V are found using an SVD of the ratings matrix, then we fix learning rate α = 0.1, regularisation λ = 0.1, maximum iterations T = 100, item distribution exponent β = 0.5 and positive item loss φ(x) = tanh(x).
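The initialisation step quoted above (U and V from an SVD of the ratings matrix, with the listed fixed hyperparameters) can be sketched as below. The rank-k truncated SVD with singular values split evenly between the two factors is an assumption on our part; the paper does not say how the SVD factors are mapped to U and V, and the toy matrix is purely illustrative.

```python
import numpy as np

k = 8  # latent dimension from the quote
rng = np.random.default_rng(1)
X = (rng.random((50, 40)) > 0.8).astype(float)  # toy binary ratings matrix

# Truncated SVD: keep the top-k singular triplets and split the singular
# values between the user factors U and item factors V (assumed convention).
P, s, Qt = np.linalg.svd(X, full_matrices=False)
U = P[:, :k] * np.sqrt(s[:k])
V = Qt[:k, :].T * np.sqrt(s[:k])

# Fixed hyperparameters as stated in the quote.
alpha = 0.1          # learning rate
lam = 0.1            # regularisation
T = 100              # maximum iterations
beta = 0.5           # item distribution exponent
phi = np.tanh        # positive item loss phi(x) = tanh(x)
```

With this split, U @ V.T reproduces the best rank-k approximation of X, which the subsequent stochastic gradient updates would then refine.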