On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation

Authors: Álvaro Labarca Silva, Denis Parra, Rodrigo Toro Icarte

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Table 1. Top-k recommendation performance comparison for buy interactions on the Retail Rocket dataset. Boldface denotes the highest performance. * denotes that the model outperforms the self-supervised baseline with a significance p-value < 0.05." and "Table 3 summarizes the results of a NIP evaluation protocol on purchase interactions in the Retail Rocket dataset."
Researcher Affiliation | Academia | 1Department of Computer Science, Pontificia Universidad Católica de Chile, Santiago, Chile 2National Center for Artificial Intelligence (CENIA), Santiago, Chile 3Instituto Milenio en Ingeniería e Inteligencia Artificial para la Salud (iHealth), Santiago, Chile 4Instituto Milenio Fundamentos de los Datos (IMFD), Santiago, Chile.
Pseudocode | No | The paper includes architectural diagrams (Figure 1) but does not contain any formal pseudocode blocks or algorithm listings.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a direct link to a code repository for the described methodology.
Open Datasets | Yes | "The Retail Rocket dataset contains sequential data collected from a real-world e-commerce website. We followed the setting by Xin et al. (2020)..." and "We also tested our models on the RC15 dataset. This is another session-based dataset constructed from retailer session data. It was proposed as part of the RecSys Challenge 2015."
Dataset Splits | Yes | We implement an 8:1:1 split ratio for train, validation, and test by using the first 80% of sessions for the train set and the last 10% for the test set.
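The session-ordered 8:1:1 split quoted above can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the dummy session IDs are assumptions.

```python
def split_sessions(sessions):
    """Split chronologically ordered sessions 8:1:1 into train/val/test.

    Assumes `sessions` is already sorted by time, matching the paper's
    use of the first 80% of sessions for training and the last 10% for
    testing, with the middle 10% left for validation.
    """
    n = len(sessions)
    train_end = int(n * 0.8)
    val_end = int(n * 0.9)
    return sessions[:train_end], sessions[train_end:val_end], sessions[val_end:]

# Example with 100 dummy session IDs: 80 train, 10 validation, 10 test.
train, val, test = split_sessions(list(range(100)))
```

Because the split is positional rather than random, the test set always contains the most recent sessions, which is the standard temporal-leakage-safe choice for session-based recommendation.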
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions optimizers like Adam and uses specific models (GRU, Caser, NextItNet, SASRec) but does not provide specific version numbers for software libraries or dependencies (e.g., 'PyTorch 1.9', 'TensorFlow 2.x').
Experiment Setup | Yes | "For SQN, we used the same hyperparameters as the original implementation (Xin et al., 2020) and trained the models for 50 epochs. Each experiment was run 5 times, with the average performance reported. Table 5 in the Appendix shows the implementation details for each model." and Table 5 lists specific values for "Epochs", "Batch", "lr", "γ", "h factor", "filter#", "f sizes", "Head#", "dropout", "CR", "BR".
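The multi-run protocol quoted above (5 runs per experiment, average reported) amounts to averaging each metric over independent seeded runs. A hedged sketch, where `run_experiment` is a hypothetical stand-in for the unspecified train-and-evaluate routine:

```python
import statistics

def average_over_runs(run_experiment, n_runs=5):
    """Run an experiment n_runs times and report the mean metric,
    mirroring the paper's 5-run averaging protocol.

    `run_experiment` is a placeholder callable taking a seed and
    returning a scalar metric (e.g., HR@10); it is not from the paper.
    """
    return statistics.mean(run_experiment(seed) for seed in range(n_runs))

# Illustrative stand-in: pretend each seed yields a slightly different metric.
fake_results = {0: 0.51, 1: 0.49, 2: 0.50, 3: 0.52, 4: 0.48}
avg = average_over_runs(lambda seed: fake_results[seed])
```

Reporting the mean over fixed seeds, rather than a single run, is what makes the paper's significance tests (p < 0.05 against the self-supervised baseline) possible.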