A General Offline Reinforcement Learning Framework for Interactive Recommendation
Authors: Teng Xiao, Donglin Wang (pp. 4512-4520)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on two public real-world datasets, demonstrating that the proposed methods can achieve superior performance over existing supervised learning and reinforcement learning methods for recommendation. |
| Researcher Affiliation | Academia | Teng Xiao, Donglin Wang Machine Intelligence Lab (MiLAB), AI Division, School of Engineering, Westlake University tengxiao01@gmail.com, wangdonglin@westlake.edu.cn |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any statement about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We conduct extensive experiments on two public real-world datasets. RecSys: a public dataset released by the RecSys Challenge 2015 (https://recsys.acm.org/recsys15/challenge/) that contains sequences of user purchases and clicks. Kaggle: a dataset from a real-world e-commerce website (https://www.kaggle.com/retailrocket/ecommerce-dataset). |
| Dataset Splits | Yes | We randomly sample 80% sequences as the training set, 10% as validation and the rest as test set. (A split sketch follows this table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper does not provide specific version numbers for ancillary software dependencies. |
| Experiment Setup | No | The paper mentions that “Hyperparameters are tuned on validation set” and that methods “are based on the same backbone i.e., recurrent neural networks (RNN)”, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed system-level training settings within the main text. |
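The paper reports only the 80%/10%/10% sequence-level split; it does not specify the random seed or the tooling used. A minimal sketch of such a split might look like the following (the function name, seed, and toy data are illustrative assumptions, not the authors' code):

```python
import random

def split_sequences(sequences, train_frac=0.8, valid_frac=0.1, seed=0):
    """Shuffle user sequences and split them into train/valid/test sets.

    The seed is an assumption for reproducibility; the paper does not report one.
    """
    rng = random.Random(seed)
    seqs = list(sequences)
    rng.shuffle(seqs)
    n = len(seqs)
    n_train = int(n * train_frac)
    n_valid = int(n * valid_frac)
    train = seqs[:n_train]
    valid = seqs[n_train:n_train + n_valid]
    test = seqs[n_train + n_valid:]  # remaining ~10%
    return train, valid, test

# Usage with 1,000 toy interaction sequences of item IDs
toy = [[i, i + 1, i + 2] for i in range(1000)]
train, valid, test = split_sequences(toy)
print(len(train), len(valid), len(test))  # 800 100 100
```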