DropoutNet: Addressing Cold Start in Recommender Systems

Authors: Maksims Volkovs, Guangwei Yu, Tomi Poutanen

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically we demonstrate state-of-the-art accuracy on publicly available benchmarks. To validate the proposed approach, we conducted extensive experiments on two publicly available datasets: CiteULike [21] and the ACM RecSys 2017 challenge dataset [2]. Warm and cold start recall@100 results are shown in Table 1.
Researcher Affiliation | Industry | Maksims Volkovs (layer6.ai, maks@layer6.ai), Guangwei Yu (layer6.ai, guang@layer6.ai), Tomi Poutanen (layer6.ai, tomi@layer6.ai)
Pseudocode | Yes | Algorithm 1: Learning Algorithm
Open Source Code | Yes | Code is available at https://github.com/layer6ai-labs/DropoutNet.
Open Datasets | Yes | To validate the proposed approach, we conducted extensive experiments on two publicly available datasets: CiteULike [21] and the ACM RecSys 2017 challenge dataset [2]. [21] refers to 'C. Wang and D. M. Blei. Collaborative topic modeling for recommending scientific articles. In Conference on Knowledge Discovery and Data Mining, 2011.' and [2] refers to 'F. Abel, Y. Deldjoo, M. Elahi, and D. Kohlsdorf. RecSys Challenge 2017. http://2017.recsyschallenge.com, 2017.'
Dataset Splits | No | The paper details training and test splits but does not explicitly mention a dedicated validation set or its size/proportion for hyperparameter tuning. For evaluation, it mentions 'Fold 1 from [21]' for CiteULike, which implies pre-defined splits, but no specific validation split for their own experiments is detailed.
Hardware Specification | Yes | All experiments were conducted on a server with a 20-core Intel Xeon E5-2630 CPU, an Nvidia Titan X GPU, and 128GB of RAM.
Software Dependencies | No | The paper mentions using the 'TensorFlow library [1]' but does not specify a version number for TensorFlow or any other software dependencies.
Experiment Setup | Yes | All DNN models are trained with mini-batches of size 100, a fixed learning rate, and momentum of 0.9. Using τ to denote the dropout rate, for each batch we randomly select τ · batch size users and items. For our model we found that 1-hidden-layer architectures with 500 hidden units and tanh activations gave good performance. We follow the approach of [6] and use a pyramid structure where the network gradually compresses the input with each successive layer. For all architectures we use fully connected layers with batch norm [14] and tanh activation functions; other activation functions such as ReLU and sigmoid produced significantly worse results. We use the three layer model in all experiments.
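
To make the quoted setup concrete, below is a minimal sketch (not the authors' released code) of one mini-batch step: a fully connected tower with batch norm and tanh activations, plus the input dropout that zeroes the preference inputs of τ · batch_size randomly chosen users or items. The dropout rate, input dimensions, learning rate, and any layer sizes beyond the quoted 500-unit hidden layer are illustrative assumptions, not values taken from the paper.

```python
import numpy as np
import tensorflow as tf

BATCH_SIZE = 100     # mini-batch size stated in the experiment setup
TAU = 0.5            # dropout rate tau; the value used in the paper is not quoted here (assumption)
PREF_DIM = 200       # preference (latent) input dimension: assumption for illustration
CONTENT_DIM = 300    # content input dimension: assumption for illustration

def make_tower(input_dim, hidden_sizes=(500,)):
    """Fully connected tower with batch norm and tanh, as described above.
    Pass decreasing sizes, e.g. (800, 400), for the pyramid-style variant."""
    model = tf.keras.Sequential([tf.keras.Input(shape=(input_dim,))])
    for units in hidden_sizes:
        model.add(tf.keras.layers.Dense(units))
        model.add(tf.keras.layers.BatchNormalization())
        model.add(tf.keras.layers.Activation("tanh"))
    return model

def apply_input_dropout(pref_batch, tau, rng):
    """Zero the preference inputs of tau * batch_size randomly selected rows,
    simulating cold-start users/items inside the mini-batch."""
    batch = pref_batch.copy()
    n_drop = int(tau * batch.shape[0])
    drop_idx = rng.choice(batch.shape[0], size=n_drop, replace=False)
    batch[drop_idx] = 0.0
    return batch

rng = np.random.default_rng(0)
pref = rng.standard_normal((BATCH_SIZE, PREF_DIM)).astype("float32")
content = rng.standard_normal((BATCH_SIZE, CONTENT_DIM)).astype("float32")
inputs = np.concatenate([apply_input_dropout(pref, TAU, rng), content], axis=1)

tower = make_tower(PREF_DIM + CONTENT_DIM)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.005, momentum=0.9)  # fixed LR; value is an assumption
embeddings = tower(inputs)  # one forward pass for a user (or item) tower
```

In the full method, a user tower and an item tower of this form are trained jointly; the released repository linked above contains the authors' actual implementation.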