Probabilistic Masked Attention Networks for Explainable Sequential Recommendation

Authors: Huiyuan Chen, Kaixiong Zhou, Zhimeng Jiang, Chin-Chia Michael Yeh, Xiaoting Li, Menghai Pan, Yan Zheng, Xia Hu, Hao Yang

IJCAI 2023

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental
LLM Response: "Experimental studies on real-world benchmark datasets show that PMAN is able to improve the performance of Transformers significantly. [...] Dataset. We consider five benchmark datasets: Amazon-Beauty, Amazon-Sports, Yelp, MovieLens-1M, and Steam."

Researcher Affiliation: Collaboration
LLM Response: Visa Research, Rice University, Texas A&M University

Pseudocode: Yes
LLM Response: "Algorithm 1 PMAN. Input: the training sequence set S, attention capacity B, embedding size d."

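The report does not reproduce Algorithm 1 itself. For orientation only, here is a minimal single-head sketch of capacity-constrained attention, assuming the capacity B simply bounds how many entries of the n x n score matrix stay unmasked; the hard top-B selection and all names below are illustrative simplifications, not the paper's probabilistic mask.

```python
import torch
import torch.nn.functional as F

def capacity_masked_attention(q, k, v, budget):
    """Single-head attention that keeps only the `budget` largest entries
    of the n x n score matrix and masks the rest.

    q, k, v: (n, d) tensors; budget: the attention capacity B.
    Illustrative only: PMAN learns a probabilistic mask, whereas this
    sketch uses a hard top-B selection.
    """
    n, d = q.shape
    scores = (q @ k.t()) / d ** 0.5            # (n, n) scaled dot-product scores
    flat = scores.flatten()
    keep = flat.topk(min(budget, flat.numel())).indices
    mask = torch.full_like(flat, float("-inf"))
    mask[keep] = 0.0                           # 0 where kept, -inf where masked
    weights = F.softmax((flat + mask).view(n, n), dim=-1)
    # A row with no kept entries softmaxes to NaN; zero it out for safety.
    weights = torch.nan_to_num(weights)
    return weights @ v
```

With n = 50 and r = 0.5, for example, this would keep B = 1250 of the 2500 score entries.
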
Open Source Code: No
LLM Response: The paper does not provide an explicit statement or link for the open-source code of the described methodology.

Open Datasets: Yes
LLM Response: "Dataset. We consider five benchmark datasets: Amazon-Beauty, Amazon-Sports, Yelp, MovieLens-1M, and Steam. For each dataset, we group the interactions by user and sort each user's items in ascending order of timestamp. Following [Fan et al., 2022], we adopt the 5-core setting to filter out unpopular items and inactive users with fewer than five interaction records. Their statistics are listed in Table 1."

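The quoted 5-core preprocessing is a standard filter. A minimal pandas sketch of it, with column names that are assumptions rather than anything from the paper; the loop is needed because removing sparse items can push users back below the threshold, and vice versa:

```python
import pandas as pd

def k_core_filter(df: pd.DataFrame, k: int = 5,
                  user_col: str = "user_id",
                  item_col: str = "item_id") -> pd.DataFrame:
    """Iteratively drop users and items with fewer than k interactions
    until the table stabilizes (the '5-core setting' for k = 5)."""
    while True:
        before = len(df)
        df = df[df.groupby(user_col)[user_col].transform("size") >= k]
        df = df[df.groupby(item_col)[item_col].transform("size") >= k]
        if len(df) == before:  # nothing removed, so the k-core condition holds
            return df
```
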
Dataset Splits: Yes
LLM Response: "Following the procedure of [Kang and McAuley, 2018; Li et al., 2020; Fan et al., 2022], we use the last item of each user's sequence for testing, the second-to-last for validation, and the remaining items for training."

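That leave-one-out protocol restates directly in code; a sketch, assuming each user's item list is already sorted by timestamp:

```python
def leave_one_out_split(items):
    """Last item -> test, second-to-last -> validation, rest -> train."""
    assert len(items) >= 3, "too short; 5-core filtering guarantees >= 5 items"
    return items[:-2], items[-2], items[-1]

train, valid, test = leave_one_out_split([10, 42, 7, 99, 3])
# train = [10, 42, 7], valid = 99, test = 3
```
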
Hardware Specification: No
LLM Response: The paper mentions "with the same hardware" but does not provide specific details about the hardware used for the experiments (e.g., CPU, GPU model, memory).

Software Dependencies: No
LLM Response: The paper mentions using "Adam as optimizer" but does not specify version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow, CUDA).

Experiment Setup: Yes
LLM Response: "The parameters for the baselines are initialized to their original settings and then carefully tuned for optimal performance. We adopt Adam as the optimizer and search the embedding dimension d in Eq. (2) within {32, 64, 128} and the item-sequence length n within {25, 50}. For the attention capacity B in Problem (8), we vary the ratio r in {0.3, 0.5, 0.7, 0.9} such that B = r·n². Moreover, all of our PMANs use single-head attention in the experiments."

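For reference, the quoted search space is small enough to enumerate directly; a sketch of the resulting grid, deriving B as r·n²:

```python
from itertools import product

embedding_dims = [32, 64, 128]          # embedding dimension d
seq_lengths = [25, 50]                  # item-sequence length n
capacity_ratios = [0.3, 0.5, 0.7, 0.9]  # ratio r, with B = r * n^2

configs = [
    {"d": d, "n": n, "r": r, "B": int(r * n * n)}
    for d, n, r in product(embedding_dims, seq_lengths, capacity_ratios)
]
print(len(configs))  # 24 candidate configurations
```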