ERMMA: Expected Risk Minimization for Matrix Approximation-based Recommender Systems

Authors: Dongsheng Li, Chao Chen, Qin Lv, Li Shang, Stephen M. Chu, Hongyuan Zha

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the MovieLens and Netflix datasets demonstrate that ERMMA outperforms six state-of-the-art MA-based recommendation methods on both the rating prediction and item ranking problems. |
| Researcher Affiliation | Collaboration | IBM Research China, Shanghai, P.R. China, 201203; University of Colorado Boulder, Boulder, Colorado, USA, 80309; Georgia Institute of Technology, Atlanta, Georgia, USA, 30332 |
| Pseudocode | No | The paper describes iterative optimization methods (e.g., the SGD update rules in Equations 7 and 8) but does not provide structured pseudocode or algorithm blocks (a hedged SGD sketch follows the table). |
| Open Source Code | No | The paper provides a link to the SMA implementation (https://github.com/ldscc/StableMA.git), which is a baseline method, but no link or explicit statement about open-sourcing the code for ERMMA. |
| Open Datasets | Yes | Three popular datasets are adopted in the experiments: MovieLens 1M (6,040 users, 3,706 items, ~10^6 ratings), MovieLens 10M (~70k users, ~10k items, ~10^7 ratings), and Netflix (~480k users, ~18k items, ~10^8 ratings). |
| Dataset Splits | No | The paper mentions splitting data into training and test sets but does not explicitly state a separate validation split or the split proportions. |
| Hardware Specification | No | The paper mentions a 'memory limitation of our server' on Netflix but does not specify exact hardware details such as CPU or GPU models or memory capacity. |
| Software Dependencies | No | The paper discusses various algorithms and methods but does not specify software dependencies with version numbers (e.g., programming language versions, library versions, or specific solver versions). |
| Experiment Setup | Yes | For ERMMA, we consider all the options including s and λ, and use learning rate v = 0.001 for stochastic gradient descent, μ = 0.06 for the regularization coefficient, ϵ = 0.0001 for the gradient-descent convergence threshold, and T = 250 for the maximum number of iterations. For RSVD and BPMF, we use the same parameter values provided in the original papers (Paterek 2007; Salakhutdinov and Mnih 2008; Li et al. 2016). For SMA, all parameters were set to the default values in their implementation. For GSMF, we select α = 1.0, β = 70, λ = 0.05, and rank r = 20. For LLORMA, we choose learning rate v = 0.001, regularization coefficient μ = 0.01, and z = 50 local models; this is a slight modification of the original LLORMA experimental setup, which achieves better performance. For WEMAREC, we adopt learning rate v = 0.002, regularization coefficient μ = 0.01, and the default values provided in the source code for unstated parameters (such as the ensemble weights and the maximum number of iterations in clustering). These settings are collected in the configuration sketch after the table. |
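
The Pseudocode and Experiment Setup rows together pin down an SGD training loop in outline: learning rate v = 0.001, regularization μ = 0.06, convergence threshold ϵ = 0.0001, and at most T = 250 iterations. Below is a minimal sketch of such a loop, assuming standard regularized matrix-factorization gradient updates in place of the paper's Equations 7 and 8 (which the paper does not present as pseudocode and are not reproduced here); the function name and data layout are hypothetical, and the ERMMA-specific expected-risk terms (the s and λ options) are omitted.

```python
import numpy as np

def sgd_matrix_factorization(ratings, n_users, n_items, rank=20,
                             lr=0.001, reg=0.06, eps=1e-4, max_iter=250,
                             seed=0):
    """Hypothetical SGD trainer for a rank-r matrix approximation.

    `ratings` is a list of (user, item, rating) triples. The per-rating
    updates are the textbook regularized MF gradients, standing in for the
    paper's Equations 7 and 8.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_users, rank))   # user factors
    V = rng.normal(scale=0.1, size=(n_items, rank))   # item factors
    prev_loss = np.inf
    for t in range(max_iter):                         # T = 250 in the paper
        for u, i, r in ratings:
            err = r - U[u] @ V[i]                     # prediction error
            # Gradient steps with learning rate v = 0.001 and
            # regularization coefficient mu = 0.06 (values from the paper).
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])
        loss = sum((r - U[u] @ V[i]) ** 2 for u, i, r in ratings)
        if abs(prev_loss - loss) < eps:               # epsilon = 1e-4
            break
        prev_loss = loss
    return U, V
```

ERMMA itself would replace the squared-error objective and its gradients with its expected-risk objective; only the hyperparameter values above are taken from the paper.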
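
For replication bookkeeping, the baseline settings quoted in the Experiment Setup row can be collected in one structure. A minimal sketch; the key names are hypothetical shorthand for the symbols in the quote, and methods whose parameters the paper defers entirely to original papers or code defaults (RSVD, BPMF, SMA) are left out.

```python
# Hyperparameters quoted in the paper's experiment setup; key names are
# hypothetical shorthand for the symbols used in the text.
EXPERIMENT_SETUP = {
    "ERMMA":   {"lr": 0.001, "reg": 0.06, "eps": 1e-4, "max_iter": 250},
    "GSMF":    {"alpha": 1.0, "beta": 70, "lambda": 0.05, "rank": 20},
    "LLORMA":  {"lr": 0.001, "reg": 0.01, "num_local_models": 50},
    "WEMAREC": {"lr": 0.002, "reg": 0.01},  # other params: source-code defaults
}
```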