Bellman Meets Hawkes: Model-Based Reinforcement Learning via Temporal Point Processes
Authors: Chao Qu, Xiaoyu Tan, Siqiao Xue, Xiaoming Shi, James Zhang, Hongyuan Mei
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate our algorithm SEDRL comprehensively in synthetic simulation and in experiments with real data, such as smart broadcasting and improving engagement on a social media platform. |
| Researcher Affiliation | Collaboration | 1 Ant Group, Hangzhou, China. 2 Toyota Technological Institute at Chicago, Chicago, IL, United States. |
| Pseudocode | Yes | The pseudocode of the algorithm is deferred to the Appendix. |
| Open Source Code | Yes | We release our code at https://github.com/WilliamBUG/Event_driven_rl/tree/main |
| Open Datasets | Yes | In particular, we use the Retweet dataset (Zhao et al. 2015) to learn a model of followers. We use data gathered from StackOverflow (Leskovec and Krevl 2014) to learn a feedback model of the users. |
| Dataset Splits | No | The paper describes the datasets used and how some parameters are set for the simulation environment, but it does not provide specific train/validation/test dataset split percentages or counts for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory, or cloud computing resources) used for running the experiments. |
| Software Dependencies | No | The paper mentions software tools and libraries like PyTorch and model architectures such as LSTM and Transformer, but it does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Other details, such as the hyperparameter tuning in SEDRL and the baselines, are in the Appendix. |
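
For context on the model class named in the paper's title (temporal point processes of the Hawkes type), the sketch below shows a standard univariate Hawkes intensity with an exponential kernel and an Ogata-style thinning simulation. This is a generic illustration under assumed parameter values (`mu`, `alpha`, `beta`) and illustrative function names; it is not the authors' SEDRL implementation or the released code.

```python
import numpy as np

def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.0):
    """Conditional intensity of a univariate Hawkes process with an
    exponential kernel: lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i))
    over past event times t_i < t. Parameter values are illustrative."""
    history = np.asarray(history)
    past = history[history < t]
    return mu + alpha * np.exp(-beta * (t - past)).sum()

def simulate_hawkes(horizon, mu=0.2, alpha=0.8, beta=1.0, seed=0):
    """Simulate event times on [0, horizon] via Ogata's thinning algorithm."""
    rng = np.random.default_rng(seed)
    t, events = 0.0, []
    while t < horizon:
        # Upper bound on the intensity until the next event: current value
        # plus the jump size alpha (the intensity decays between events).
        lam_bar = hawkes_intensity(t, events, mu, alpha, beta) + alpha
        t += rng.exponential(1.0 / lam_bar)
        if t >= horizon:
            break
        # Accept the candidate time with probability lambda(t) / lam_bar.
        if rng.uniform() * lam_bar <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)
    return events

if __name__ == "__main__":
    ev = simulate_hawkes(horizon=50.0)
    print(f"{len(ev)} events; intensity at horizon: "
          f"{hawkes_intensity(50.0, ev):.3f}")
```

With `alpha / beta < 1` the process is stationary, so the simulated sequences stay finite; this kind of synthetic event stream is the sort of input a temporal-point-process environment model would be fit to, though the paper's actual simulation setup should be taken from its Appendix and released code.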