Fighting Boredom in Recommender Systems with Linear Reinforcement Learning

Authors: Romain Warlop, Alessandro Lazaric, Jérémie Mary

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we empirically validate the model assumptions and the algorithm in a number of realistic scenarios."
Researcher Affiliation | Collaboration | Romain Warlop (fifty-five, Paris, France; SequeL Team, Inria Lille, France; romain@fifty-five.com), Alessandro Lazaric (Facebook AI Research, Paris, France; lazaric@fb.com), Jérémie Mary (Criteo AI Lab, Paris, France; j.mary@criteo.com)
Pseudocode | Yes | "Algorithm 1: The LINUCRL algorithm."
Open Source Code | No | The paper states that a dataset "will be released publicly as soon as possible" but makes no such statement about the source code for the methodology.
Open Datasets | Yes | "In order to provide a preliminary validation of our model, we use the movielens-100k dataset [9, 7]."
Dataset Splits | No | The paper describes using the movielens-100k dataset to estimate model parameters and construct a simulator, but does not specify explicit train/validation/test splits.
Hardware Specification | No | The paper does not provide specific details on the hardware used for the experiments, such as CPU/GPU models or memory specifications.
Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | "We choose K = 10 actions corresponding to different genres of movies, and we set d = 5 and w = 5, which results in K^w = 10^5 states. [...] The parameters that describe the dependency of the reward function on the recency (i.e., θ_{j,a}) are computed by using the ratings averaged over all users for each state encountered and for ten different genres in the dataset. [...] Finally, the observed reward is obtained by adding a small random Gaussian noise to the linear function."
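To make the quoted setup concrete, the sketch below simulates the environment that row describes: a state is the ordered window of the last w recommended genres (hence K^w = 10^5 states), the expected reward of a genre is linear in the powers (up to degree d) of its recency within the window, and a small Gaussian noise is added to each observation. This is a minimal illustration under stated assumptions, not the paper's code: the names (recency, expected_reward, step), the random theta coefficients, and the count-based recency proxy are all hypothetical, whereas the paper estimates θ_{j,a} from movielens-100k ratings averaged over all users.

```python
import numpy as np

# Minimal simulator sketch matching the quoted experiment setup.
K = 10   # actions: movie genres
w = 5    # window length: the state is the last w genres shown
d = 5    # polynomial degree in recency; K**w = 10**5 states

rng = np.random.default_rng(0)

# Hypothetical coefficients theta[a, j]; the paper instead fits theta_{j,a}
# from movielens-100k ratings averaged over all users per state.
theta = rng.normal(scale=0.5, size=(K, d + 1))

def recency(state, a):
    # Assumed recency proxy: fraction of the last w recommendations
    # that were genre a (the paper's exact definition may differ).
    return state.count(a) / w

def expected_reward(state, a):
    # Reward is linear in the recency features (1, rho, ..., rho**d).
    rho = recency(state, a)
    return sum(theta[a, j] * rho ** j for j in range(d + 1))

def step(state, a, noise_std=0.1):
    # Observe a noisy reward, then slide the window of recent genres.
    r = expected_reward(state, a) + rng.normal(scale=noise_std)
    return state[1:] + (a,), r

state = tuple(int(g) for g in rng.integers(K, size=w))  # random initial window
state, r = step(state, a=3)
print(state, round(r, 3))
```

Representing the state as an ordered tuple of the last w actions is what yields K^w distinct states; a learner such as LINUCRL would then estimate the per-genre coefficients online rather than drawing them at random as this sketch does.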