Learning Temporal Point Processes via Reinforcement Learning

Authors: Shuang Li, Shuai Xiao, Shixiang Zhu, Nan Du, Yao Xie, Le Song

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conducted experiments on various synthetic and real sequences of event data and showed that our approach outperforms the state-of-the-art regarding both data description and computational efficiency." Section 6 (Experiments): "We evaluate our algorithm by comparing with state-of-the-arts on both synthetic and real datasets."
Researcher Affiliation | Collaboration | Shuang Li¹, Shuai Xiao², Shixiang Zhu¹, Nan Du³, Yao Xie¹, and Le Song¹,²; ¹Georgia Institute of Technology, ²Ant Financial, ³Google Brain
Pseudocode | Yes | Algorithm 1 (RLPP): Mini-batch Reinforcement Learning for Learning Point Processes; a heavily simplified illustrative sketch of such a mini-batch loop appears after this table.
Open Source Code | No | The paper links open-source code for the baseline methods WGANTPP and RMTPP in footnotes (https://github.com/xiaoshuai09/Wasserstein-Learning-For-Point-Process and https://github.com/dunan/NeuralPointProcess), but does not provide access to the source code for its own proposed method, RLPP.
Open Datasets | Yes | "Medical Information Mart for Intensive Care III (MIMIC-III) contains de-identified clinical visit records from 2001 to 2012 for more than 40,000 patients."
Dataset Splits | No | The paper does not explicitly provide dataset split information (e.g., percentages or sample counts for train/validation/test sets). It describes the batch size used for training but not how the data are partitioned.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models or memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions software such as TensorFlow and the GPy package but does not specify version numbers.
Experiment Setup | Yes | "The policy in our method RLPP is parameterized as an LSTM with 64 hidden neurons, and π(a|Θ(h)) is chosen to be an exponential distribution. The batch size is 32 (the numbers of sampled sequences L and M in Algorithm 1 are both 32), and the learning rate is 1e-3. We use the Gaussian kernel k(t, t′) = exp(−‖t − t′‖²/σ²) for the reward function. The kernel bandwidth σ is estimated using the median trick based on the observations [13]."
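
The quoted setup depends on two concrete ingredients: a Gaussian kernel over event times and a bandwidth σ chosen by the median trick. The snippet below is a minimal sketch of both, assuming scalar event times; the function names are illustrative and not taken from the authors' code.

```python
import numpy as np

def median_bandwidth(event_times):
    """Median trick: set sigma to the median pairwise distance among the
    observed event times (the heuristic the paper cites as [13])."""
    t = np.asarray(event_times, dtype=float)
    i, j = np.triu_indices(len(t), k=1)  # pairs with i < j, so self-distances are excluded
    return np.median(np.abs(t[i] - t[j]))

def gaussian_kernel(t, t_prime, sigma):
    """Gaussian kernel k(t, t') = exp(-||t - t'||^2 / sigma^2) used in the reward function."""
    return np.exp(-((t - t_prime) ** 2) / sigma ** 2)

# Usage on a toy event sequence.
events = np.array([0.4, 1.1, 1.9, 3.2, 4.0])
sigma = median_bandwidth(events)
print(sigma, gaussian_kernel(events[0], events[1], sigma))
```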
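
For intuition about the mini-batch reinforcement-learning loop named in the Pseudocode row, the following is a heavily simplified, hypothetical sketch rather than the paper's Algorithm 1: the LSTM policy is replaced by a single learnable constant intensity, and the kernel-based reward is replaced by a crude event-count comparison, so only the REINFORCE-style mini-batch update pattern is illustrated. All identifiers (sample_sequence, log_rate, etc.) are invented for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 10.0  # observation window [0, T]

def sample_sequence(rate):
    """Sample a homogeneous Poisson event sequence on [0, T] with the given rate."""
    times, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / rate)
        if t > T:
            return np.array(times)
        times.append(t)

# "Real" data drawn from an unknown ground-truth intensity of 2.0.
real_seqs = [sample_sequence(2.0) for _ in range(256)]

log_rate = 0.0        # policy parameter (log intensity), i.e. an initial rate of 1.0
lr, batch = 0.02, 32  # learning rate and mini-batch size (32, as in the quoted setup)

for it in range(300):
    rate = np.exp(log_rate)
    fake = [sample_sequence(rate) for _ in range(batch)]                      # L rollouts
    real = [real_seqs[i] for i in rng.integers(len(real_seqs), size=batch)]   # M real sequences
    # Crude sequence-level reward: generated event counts should match real ones.
    rewards = np.array([-abs(len(f) - len(r)) for f, r in zip(fake, real)], dtype=float)
    rewards -= rewards.mean()                         # baseline for variance reduction
    # Score function of the exponential policy: d log p(seq) / d log_rate = n - rate * T.
    score = np.array([len(f) - rate * T for f in fake])
    log_rate += lr * float(np.mean(rewards * score))  # REINFORCE ascent step

print("learned rate:", np.exp(log_rate))              # should drift toward roughly 2.0
```

In the paper itself, the rollouts come from the LSTM policy described in the Experiment Setup row, and the reward is built from the Gaussian kernel sketched above over generated and observed event times.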