Dynamic Memory based Attention Network for Sequential Recommendation

Authors: Qiaoyu Tan, Jianwei Zhang, Ninghao Liu, Xiao Huang, Hongxia Yang, Jingren Zhou, Xia Hu

AAAI 2021, pp. 4384-4392

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results over four benchmark datasets demonstrate the superiority of our model in capturing long-term dependency over various state-of-the-art sequential models.
Researcher Affiliation | Collaboration | 1 Department of Computer Science and Engineering, Texas A&M University; 2 Alibaba Group; 3 The Hong Kong Polytechnic University
Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper does not provide an unambiguous statement about releasing open-source code for the described methodology, nor a direct link to a code repository.
Open Datasets | Yes | We conduct experiments over four public benchmarks. Statistics of them are summarized in Table 2. MovieLens (https://grouplens.org/datasets/movielens/1m/) collects users' rating scores for movies. JD.com (Lv et al. 2019) is a collection of user browsing logs over e-commerce products collected from JD.com. Taobao (Zhu et al. 2018) and XLong (Ren et al. 2019) are datasets of user behaviors from the commercial platform of Taobao. The behavior sequences in XLong are significantly longer than in the other three datasets, which makes them difficult to model.
Dataset Splits | Yes | Following the traditional way (Kang and McAuley 2018), we employ the last and second-to-last interactions for testing and validation, respectively, and the remaining interactions for training. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper mentions using 'TensorFlow' for the implementation and the 'Adam optimizer', but does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | To enable a fair comparison, all methods are optimized with the number of samples set to 5 and the embedding dimension D set to 128. We implement DMAN with TensorFlow, and the Adam optimizer is used to train the model with a learning rate of 0.001. The batch size is set to 128 and the maximum number of epochs is 8. The number of memory slots m and the number of attention layers L are searched over {2, 4, 6, 8, 10, 20, 30} and {1, 2, 3, 4, 5}, respectively. In experiments, we found m = 20 to be sufficient for MovieLens, Taobao, JD.com, and XLong. From Figure 2(b), we observe that the number of attention layers has a positive impact in our model. To trade off memory cost against performance, we set L = 2 for all datasets since it already achieves satisfactory results. (A configuration sketch follows the table.)
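
The leave-one-out protocol quoted in the Dataset Splits row is a standard construction; the sketch below is a minimal illustration of it, not code from the paper. The data layout (one chronologically ordered list of item IDs per user) and the function name are assumptions for illustration only.

```python
# Hypothetical sketch of the leave-one-out split described in the paper:
# last interaction -> test, second-to-last -> validation, remainder -> train.
def leave_one_out_split(user_sequences):
    """user_sequences: dict mapping user ID -> chronologically ordered item IDs."""
    train, valid, test = {}, {}, {}
    for user, seq in user_sequences.items():
        if len(seq) < 3:
            # Sequence too short to hold out two items; keep it all for training.
            train[user] = seq
            continue
        train[user] = seq[:-2]
        valid[user] = (seq[:-2], seq[-2])  # (history, held-out validation item)
        test[user] = (seq[:-1], seq[-1])   # (history, held-out test item)
    return train, valid, test
```

For example, a user with interactions [5, 9, 2, 7] would contribute [5, 9] to training, item 2 (with history [5, 9]) to validation, and item 7 (with history [5, 9, 2]) to testing.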
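
The Experiment Setup row fully specifies the reported hyperparameters, so they can be collected into a configuration sketch. This is a minimal, hypothetical skeleton assuming a TensorFlow/Keras-style training loop; the paper releases no code, and only the numeric values below come from the quoted text.

```python
import tensorflow as tf

# Hyperparameters reported in the paper's experiment setup. The dictionary keys
# and the use of tf.keras.optimizers.Adam are illustrative assumptions; the
# paper only states that TensorFlow and the Adam optimizer were used.
config = {
    "embedding_dim": 128,        # D
    "num_samples": 5,
    "num_memory_slots": 20,      # m; 20 found sufficient for all four datasets
    "num_attention_layers": 2,   # L; chosen to balance memory cost and accuracy
    "batch_size": 128,
    "max_epochs": 8,
    "learning_rate": 1e-3,
}

optimizer = tf.keras.optimizers.Adam(learning_rate=config["learning_rate"])
```

Reproducing the search described in the paper would mean sweeping `num_memory_slots` over {2, 4, 6, 8, 10, 20, 30} and `num_attention_layers` over {1, 2, 3, 4, 5} and selecting by validation performance.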