ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor

Authors: Wanqi Xue, Qingpeng Cai, Ruohan Zhan, Dong Zheng, Peng Jiang, Kun Gai, Bo An

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on a benchmark dataset and a large-scale industrial dataset which consists of tens of millions of recommendation requests. Experimental results show that our method significantly outperforms the state-of-the-art baselines in various long-term engagement optimization tasks.
Researcher Affiliation | Collaboration | (1) Nanyang Technological University, (2) Kuaishou Technology, (3) Hong Kong University of Science and Technology, (4) Unaffiliated
Pseudocode | Yes | A formal description for ResAct algorithm is shown in Appendix A. ... Algorithm 1: ResAct-LEARNING ... Algorithm 2: ResAct-EXECUTION
Open Source Code | Yes | We describe the implementation details of ResAct in Appendix F, and also provide our source code and data in the supplementary material and external link. ... Data samples and codes can be found in https://www.dropbox.com/sh/btf0drgm99vmpfe/AADtkmOLZPQ0sTqmsA0f0APna?dl=0.
Open Datasets | Yes | We conduct experiments on a synthetic dataset MovieLens L-1m based on MovieLens-1m (a popular benchmark for evaluating recommendation algorithms) and collected a large-scale industrial dataset RecL-25m from a real-life streaming platform of short-form videos. ... Data samples and codes can be found in https://www.dropbox.com/sh/btf0drgm99vmpfe/AADtkmOLZPQ0sTqmsA0f0APna?dl=0.
Dataset Splits | Yes | We sample the data of 5000 users as the training set, and use the data of the remaining users as the test set (with 50 users as the validation set). ... Among the 99,899 users, we randomly selected 80% of the users as the training set, of which 500 users were reserved for validation. The remaining 20% users constitute the test set.
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) are provided for running the experiments.
Software Dependencies | No | All methods are implemented with PyTorch. ... Optimizer Adam (Kingma & Ba, 2014).
Experiment Setup | Yes | We provide the hyper-parameters for ResAct in Table 5. ... Table 5: Hyper-parameters of ResAct. Optimizer: Adam (Kingma & Ba, 2014); Actor Learning Rate: 5 × 10^-6; Critic Learning Rate: 5 × 10^-5; Batch Size: 4096; Normalized Observations: True; Gradient Clipping: False; Discount Factor: 0.9; Number of Behavior Estimators: 20; Weight of L_Exp: 5 × 10^-2; Weight of L_Con: 5 × 10^-1; Target Update Rate: 1 × 10^-2; Number of Epoch: 5
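For readers who want to reuse the reported setup, the Table 5 settings and the user-level split described under Dataset Splits could be written down roughly as follows. This is a minimal sketch, not the authors' code: the class name `ResActHyperParams`, its field names, the function `split_users`, and the seed are all hypothetical. The exponents are taken to be negative powers of ten, which is the usual reading for learning rates and update rates but is an assumption here, as is rounding the 80% cut down to a whole number of users.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ResActHyperParams:
    """Table 5 values as reported; field names are illustrative only."""
    optimizer: str = "Adam"
    actor_lr: float = 5e-6            # assumed reading of "5 x 10^-6"
    critic_lr: float = 5e-5           # assumed reading of "5 x 10^-5"
    batch_size: int = 4096
    normalize_observations: bool = True
    gradient_clipping: bool = False
    discount_factor: float = 0.9
    num_behavior_estimators: int = 20
    weight_l_exp: float = 5e-2        # assumed negative exponent
    weight_l_con: float = 5e-1        # assumed negative exponent
    target_update_rate: float = 1e-2  # assumed negative exponent
    num_epochs: int = 5


def split_users(user_ids, train_fraction=0.8, num_validation=500, seed=0):
    """Split user IDs into disjoint train / validation / test lists.

    Assumption: validation users are carved out of the training pool,
    following "of which 500 users were reserved for validation".
    """
    users = sorted(user_ids)
    random.Random(seed).shuffle(users)
    # Rounding the cut down is an assumption; the source does not say.
    n_train_pool = int(len(users) * train_fraction)
    if num_validation > n_train_pool:
        raise ValueError("validation set larger than the training pool")
    train_pool, test = users[:n_train_pool], users[n_train_pool:]
    validation = train_pool[:num_validation]
    train = train_pool[num_validation:]
    return train, validation, test


if __name__ == "__main__":
    hp = ResActHyperParams()
    train, validation, test = split_users(range(99_899))
    print(hp.actor_lr, len(train), len(validation), len(test))
```

Splitting by user rather than by interaction keeps every session of a given user in a single partition, which matches the per-user wording of the quoted split description.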