Hierarchical Reinforcement Learning for Point of Interest Recommendation
Authors: Yanan Xiao, Lu Jiang, Kunpeng Liu, Yuanbo Xu, Pengyang Wang, Minghao Yin
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through evaluations on multiple real-world datasets, we have demonstrated that HRLPRP surpasses existing state-of-the-art methods in various recommendation performance metrics. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Information Technology, Northeast Normal University, China; (2) Department of Information Science and Technology, Dalian Maritime University, China; (3) Department of Computer Science, Portland State University, USA; (4) College of Computer Science and Technology, Jilin University, China; (5) Department of Computer and Information Science, University of Macau, China; (6) Key Laboratory of Applied Statistics of MOE, Northeast Normal University, China; (7) Mobile Intelligent Computing (MIC) Lab, Jilin University, China; (8) The State Key Laboratory of Internet of Things for Smart City, University of Macau, China |
| Pseudocode | Yes | The training process is shown in Algorithm 1. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | In our study, we additionally included other datasets from Tokyo, Brightkite, Instagram, and Gowalla, which are widely used benchmarks in POI recommendation studies. |
| Dataset Splits | Yes | During the training phase, the last POI of the sequence is set as the target, and the rest constitutes the historical context. When generating negative samples, the target POI is replaced by four random POIs. In the testing phase, each check-in in the test set is treated as a target event and paired with 99 random negative instances to fully evaluate model performance. (A code sketch of this split-and-sampling protocol follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions various models (e.g., RNN, LSTM) and mechanisms (e.g., NAIS) but does not provide specific version numbers for any software libraries or dependencies used for implementation. |
| Experiment Setup | Yes | For the profile reviser, the sampling time M is set to 3, and the learning rate is set to 0.001/0.0005 at the pre-training and joint-training stages, respectively. In the policy function, the dimensions of the hidden layers, d^l_2 and d^h_2, are both set to 8. For the basic recommender, the dimension of the POI embeddings is set to 128, the learning rate is 0.001 at both the pre-training and joint-training stages, and the minibatch size is 128. The delayed coefficient λ for joint training is 0.0005. (These values are gathered in the configuration sketch after the table.) |
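
The leave-last-out split and negative-sampling protocol in the Dataset Splits row is mechanical enough to express in code. Below is a minimal sketch, assuming each user's check-in history is a list of POI IDs; `split_sequence` and the `all_pois` candidate pool are illustrative names, not the authors' implementation (no code is released).

```python
import random

def split_sequence(sequence, all_pois, num_negatives, rng=random):
    """Leave-last-out split with random negative sampling.

    The last POI of a check-in sequence becomes the target, the
    preceding POIs form the historical context, and the target is
    paired with `num_negatives` POIs drawn at random from the rest
    of the POI vocabulary.
    """
    history, target = list(sequence[:-1]), sequence[-1]
    candidates = [poi for poi in all_pois if poi != target]
    negatives = rng.sample(candidates, num_negatives)
    return history, target, negatives

all_pois = list(range(1000))        # toy POI vocabulary
checkins = [3, 17, 42, 99, 256]     # one user's check-in sequence

# Training phase: the target is replaced by four random POIs.
train_history, train_target, train_negs = split_sequence(checkins, all_pois, 4)

# Testing phase: each target is paired with 99 random negatives.
test_history, test_target, test_negs = split_sequence(checkins, all_pois, 99)
```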
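
Similarly, the hyperparameters in the Experiment Setup row can be collected in one place. The dataclass below is a hypothetical configuration container holding only the values reported above; all field names are illustrative, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HRLPRPConfig:
    """Hyperparameters as reported in the paper's experiment setup."""
    # Profile reviser
    sampling_time_m: int = 3             # sampling time M
    reviser_lr_pretrain: float = 0.001   # pre-training stage
    reviser_lr_joint: float = 0.0005     # joint-training stage
    policy_hidden_dim: int = 8           # d^l_2 and d^h_2 in the policy function
    # Basic recommender
    poi_embedding_dim: int = 128
    recommender_lr: float = 0.001        # same at both training stages
    batch_size: int = 128
    # Joint training
    delay_coefficient: float = 0.0005    # delayed coefficient λ

config = HRLPRPConfig()
assert config.poi_embedding_dim == 128
```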