ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor
Authors: Wanqi Xue, Qingpeng Cai, Ruohan Zhan, Dong Zheng, Peng Jiang, Kun Gai, Bo An
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on a benchmark dataset and a large-scale industrial dataset which consists of tens of millions of recommendation requests. Experimental results show that our method significantly outperforms the state-of-the-art baselines in various long-term engagement optimization tasks. |
| Researcher Affiliation | Collaboration | (1) Nanyang Technological University, (2) Kuaishou Technology, (3) Hong Kong University of Science and Technology, (4) Unaffiliated |
| Pseudocode | Yes | A formal description for ResAct algorithm is shown in Appendix A. ... Algorithm 1: ResAct-LEARNING ... Algorithm 2: ResAct-EXECUTION |
| Open Source Code | Yes | We describe the implementation details of ResAct in Appendix F, and also provide our source code and data in the supplementary material and external link. ... Data samples and codes can be found in https://www.dropbox.com/sh/btf0drgm99vmpfe/AADtkmOLZPQ0sTqmsA0f0APna?dl=0. |
| Open Datasets | Yes | We conduct experiments on a synthetic dataset MovieLensL-1m based on MovieLens-1m (a popular benchmark for evaluating recommendation algorithms) and collected a large-scale industrial dataset RecL-25m from a real-life streaming platform of short-form videos. ... Data samples and codes can be found in https://www.dropbox.com/sh/btf0drgm99vmpfe/AADtkmOLZPQ0sTqmsA0f0APna?dl=0. |
| Dataset Splits | Yes | We sample the data of 5000 users as the training set, and use the data of the remaining users as the test set (with 50 users as the validation set). ... Among the 99,899 users, we randomly selected 80% of the users as the training set, of which 500 users were reserved for validation. The remaining 20% users constitute the test set. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) are provided for running the experiments. |
| Software Dependencies | No | All methods are implemented with PyTorch. ... Optimizer Adam (Kingma & Ba, 2014). |
| Experiment Setup | Yes | We provide the hyper-parameters for ResAct in Table 5. ... Table 5: Hyper-parameters of ResAct. Optimizer: Adam (Kingma & Ba, 2014); Actor Learning Rate: 5 × 10⁻⁶; Critic Learning Rate: 5 × 10⁻⁵; Batch Size: 4096; Normalized Observations: True; Gradient Clipping: False; Discount Factor: 0.9; Number of Behavior Estimators: 20; Weight of L_Exp: 5 × 10⁻²; Weight of L_Con: 5 × 10⁻¹; Target Update Rate: 1 × 10⁻²; Number of Epoch: 5 |
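
The user-level split quoted in the Dataset Splits row (99,899 users, 80% for training with 500 of those held out for validation, the remaining 20% for testing) could be reproduced along the following lines. This is a minimal sketch, not the authors' code: the function name `split_users`, the seed, the use of integer user IDs, and the choice to round the 80% boundary down are all assumptions made for illustration.

```python
import random


def split_users(user_ids, train_frac=0.8, n_val=500, seed=0):
    """Split users (not interactions) into train / validation / test sets.

    Hypothetical helper: the name, seed, and rounding rule are assumptions.
    """
    ids = list(user_ids)
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(ids)

    # Assumption: the 80% boundary is rounded down to a whole user.
    n_train_pool = int(len(ids) * train_frac)
    train_pool, test = ids[:n_train_pool], ids[n_train_pool:]

    # The validation users are reserved from within the training portion.
    val, train = train_pool[:n_val], train_pool[n_val:]
    return train, val, test


# Placeholder IDs 0..99898 stand in for the real user identifiers.
train, val, test = split_users(range(99_899))
print(len(train), len(val), len(test))
```

Splitting by user rather than by interaction keeps every session of a given user inside one partition, which matches the per-user wording of the quoted description.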