Learning Temporal Point Processes via Reinforcement Learning
Authors: Shuang Li, Shuai Xiao, Shixiang Zhu, Nan Du, Yao Xie, Le Song
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted experiments on various synthetic and real sequences of event data and showed that our approach outperforms the state-of-the-art regarding both data description and computational efficiency. Section 6 (Experiments): We evaluate our algorithm by comparing with state-of-the-art methods on both synthetic and real datasets. |
| Researcher Affiliation | Collaboration | Shuang Li¹, Shuai Xiao², Shixiang Zhu¹, Nan Du³, Yao Xie¹, and Le Song¹,² (¹Georgia Institute of Technology, ²Ant Financial, ³Google Brain) |
| Pseudocode | Yes | Algorithm 1, 'RLPP: Mini-batch Reinforcement Learning for Learning Point Processes' (a minimal illustrative sketch of this training loop is given below the table). |
| Open Source Code | No | The paper links to open-source code for the baseline methods WGANTPP and RMTPP in footnotes (e.g., 'https://github.com/xiaoshuai09/Wasserstein-Learning-For-Point-Process' and 'https://github.com/dunan/NeuralPointProcess'), but does not provide source code for its own proposed method, RLPP. |
| Open Datasets | Yes | Medical Information Mart for Intensive Care III (MIMIC-III) contains de-identified clinical visit records from 2001 to 2012 for more than 40,000 patients. |
| Dataset Splits | No | The paper does not explicitly provide dataset split information (e.g., percentages or sample counts for train/validation/test sets). It describes the batch size used for training but not how the data were partitioned. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running experiments. |
| Software Dependencies | No | The paper mentions software such as 'Tensorflow' and the 'GPy' package but does not specify version numbers. |
| Experiment Setup | Yes | The policy in our method RLPP is parameterized as an LSTM with 64 hidden neurons, and π(a\|Θ(h)) is chosen to be an exponential distribution. Batch size is 32 (the numbers of sampled sequences L and M in Algorithm 1 are both 32), and the learning rate is 1e-3. We use the Gaussian kernel k(t, t′) = exp(−‖t − t′‖²/σ²) for the reward function. The kernel bandwidth σ is estimated using the median trick based on the observations [13]. (The median trick is sketched below the table.) |
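
Since the authors do not release code for RLPP, the following is a minimal, illustrative PyTorch sketch of the generator side of Algorithm 1, assembled only from the setup quoted above: an LSTM policy with 64 hidden units whose output parameterizes an exponential distribution over the next inter-event time, updated with a REINFORCE-style gradient at learning rate 1e-3. The class name `RLPPPolicy`, the softplus link from hidden state to rate, and the placeholder reward are our assumptions, not the authors' implementation (their reward is the kernel-based discrepancy between generated and observed sequences).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RLPPPolicy(nn.Module):
    """Illustrative sketch (not the authors' code): an LSTM policy whose
    hidden state parameterizes an exponential distribution over the next
    inter-event time, matching the setup quoted in the table above."""

    def __init__(self, hidden_size=64):  # 64 hidden neurons, as in the paper
        super().__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTMCell(input_size=1, hidden_size=hidden_size)
        self.rate_head = nn.Linear(hidden_size, 1)  # hidden state -> rate

    def sample_sequence(self, horizon):
        """Roll out one event sequence on [0, horizon]; return the event
        times and the log-probabilities needed for the policy gradient."""
        h = torch.zeros(1, self.hidden_size)
        c = torch.zeros(1, self.hidden_size)
        last_gap = torch.zeros(1, 1)  # previous inter-event time as input
        t, times, log_probs = 0.0, [], []
        while True:
            h, c = self.lstm(last_gap, (h, c))
            rate = F.softplus(self.rate_head(h)) + 1e-6  # keep rate positive
            dist = torch.distributions.Exponential(rate)
            gap = dist.sample()
            if t + gap.item() >= horizon:
                break  # next event would fall outside the observation window
            t += gap.item()
            times.append(t)
            log_probs.append(dist.log_prob(gap))
            last_gap = gap
        return times, log_probs

# One REINFORCE-style update; the paper computes the reward from the
# Gaussian kernel between generated and observed sequences, for which a
# constant placeholder is substituted here.
policy = RLPPPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
times, log_probs = policy.sample_sequence(horizon=10.0)
reward = torch.tensor(1.0)  # placeholder for the kernel-based reward
if log_probs:
    loss = -(reward * torch.stack(log_probs).sum())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```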
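
The 'median trick' cited for the kernel bandwidth is a standard heuristic: set σ to the median pairwise distance among the observed event times. A minimal NumPy sketch, assuming scalar event times (function names are ours):

```python
import numpy as np

def median_bandwidth(event_times):
    """Median trick: sigma = median of pairwise distances between events."""
    t = np.asarray(event_times, dtype=float)
    pairwise = np.abs(t[:, None] - t[None, :])
    upper = pairwise[np.triu_indices(len(t), k=1)]  # drop self-distances
    return np.median(upper)

def gaussian_kernel(t1, t2, sigma):
    """The reward kernel from the table: k(t, t') = exp(-|t - t'|^2 / sigma^2)."""
    return np.exp(-np.abs(t1 - t2) ** 2 / sigma ** 2)

sigma = median_bandwidth([0.3, 1.1, 2.4, 4.0])  # example event times
print(gaussian_kernel(1.1, 2.4, sigma))
```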