An Attentive Inductive Bias for Sequential Recommendation beyond the Self-Attention
Authors: Yehjin Shin, Jeongwhan Choi, Hyowon Wi, Noseong Park
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our proposed approach through extensive experiments on 6 benchmark datasets. The experimental results demonstrate that our model outperforms 7 baseline methods in terms of recommendation performance. |
| Researcher Affiliation | Academia | Yonsei University, Seoul, South Korea {yehjin.shin, jeongwhan.choi, wihyowon, noseong}@yonsei.ac.kr |
| Pseudocode | No | The paper describes its proposed model architecture and process in text and through a diagram (Figure 4), but it does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/yehjin-shin/BSARec. |
| Open Datasets | Yes | Datasets We evaluate our model on 6 SR datasets where the sparsity and domain varies: i,ii,iii) Amazon Beauty, Sports, Toys (McAuley et al. 2015), iv) Yelp, v) ML-1M (Harper and Konstan 2015), and vi) LastFM. |
| Dataset Splits | No | The paper describes how the next item is selected for prediction but does not give specific train/validation/test split details (e.g., percentages or sample counts) in the main text. It mentions data pre-processing and refers to an Appendix for best hyperparameters, implying that standard splits are used, but they are not explicitly defined. |
| Hardware Specification | Yes | Our method is implemented in PyTorch on an NVIDIA RTX 3090 with 16 GB memory. |
| Software Dependencies | No | The paper states that the method is "implemented in PyTorch" but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We conduct experiments under the following hyperparameters: the coefficient α is in {0.1, 0.3, 0.5, 0.7, 0.9}, and c is chosen from {1, 3, 5, 7, 9}. The number of BSA blocks L is set to 2, and the number of heads in the Transformer h is in {1, 2, 4}. The dimension D is set to 64, and the maximum sequence length N is set to 50. For training, the Adam optimizer is used with a learning rate in {5 × 10⁻⁴, 1 × 10⁻³}, and the batch size is set to 256. |
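
The quoted experiment setup amounts to a small hyperparameter grid. Below is a minimal sketch of that search space, assuming a generic Python grid enumerator; the identifiers (`alpha`, `c`, `num_heads`, `lr`, `iter_configs`) are illustrative and not taken from the BSARec repository.

```python
# Sketch of the hyperparameter grid described in the Experiment Setup row.
# Tunable values come from the quoted ranges; fixed values (blocks, dimension,
# sequence length, batch size) are held constant as stated in the paper.
from itertools import product

SEARCH_SPACE = {
    "alpha":     [0.1, 0.3, 0.5, 0.7, 0.9],  # coefficient α
    "c":         [1, 3, 5, 7, 9],            # frequency hyperparameter c
    "num_heads": [1, 2, 4],                  # Transformer heads h
    "lr":        [5e-4, 1e-3],               # Adam learning rate
}

FIXED = {
    "num_blocks": 2,    # BSA blocks L
    "hidden_dim": 64,   # dimension D
    "max_seq_len": 50,  # maximum sequence length N
    "batch_size": 256,
}

def iter_configs():
    """Yield every combination of the tunable hyperparameters merged with the fixed ones."""
    keys = list(SEARCH_SPACE)
    for values in product(*(SEARCH_SPACE[k] for k in keys)):
        yield {**FIXED, **dict(zip(keys, values))}

if __name__ == "__main__":
    configs = list(iter_configs())
    print(f"{len(configs)} candidate configurations")  # 5 * 5 * 3 * 2 = 150
    print(configs[0])
```

Enumerating the grid this way makes the reported search explicit: 150 candidate configurations per dataset, each differing only in α, c, the number of heads, and the learning rate.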