Intensity-Free Learning of Temporal Point Processes
Authors: Oleksandr Shchur, Marin Biloš, Stephan Günnemann
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed models on the established task of event time prediction (with and without marks) in Sections 5.1 and 5.2. In the remaining experiments, we show how the log-normal mixture model can be used for incorporating extra conditional information, training with missing data and learning sequence embeddings. We use 6 real-world datasets containing event data from various domains: Wikipedia (article edits), MOOC (user interaction with online course system), Reddit (posts in social media) (Kumar et al., 2019), Stack Overflow (badges received by users), Last FM (music playback) (Du et al., 2016), and Yelp (check-ins to restaurants). We also generate 5 synthetic datasets (Poisson, Renewal, Self-correcting, Hawkes1, Hawkes2), as described in Omi et al. (2019). (A minimal sketch of the log-normal mixture density appears below the table.) |
| Researcher Affiliation | Academia | Oleksandr Shchur, Marin Biloš, Stephan Günnemann, Technical University of Munich, Germany {shchur,bilos,guennemann}@in.tum.de |
| Pseudocode | No | The paper describes procedures in text, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and datasets are available under https://github.com/shchur/ifl-tpp. |
| Open Datasets | Yes | We use 6 real-world datasets containing event data from various domains: Wikipedia (article edits), MOOC (user interaction with online course system), Reddit (posts in social media) (Kumar et al., 2019), Stack Overflow (badges received by users), Last FM (music playback) (Du et al., 2016), and Yelp (check-ins to restaurants). We also generate 5 synthetic datasets (Poisson, Renewal, Self-correcting, Hawkes1, Hawkes2), as described in Omi et al. (2019). |
| Dataset Splits | Yes | For each dataset, we use 60% of the sequences for training, 20% for validation and 20% for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper only states "Our implementation uses PyTorch (Paszke et al., 2017)"; no versions or other software dependencies are listed. |
| Experiment Setup | Yes | Optimization is performed using Adam (Kingma & Ba, 2015) with learning rate 10⁻³. We perform training using mini-batches of 64 sequences. We train for up to 2000 epochs (1 epoch = 1 full pass through all the training sequences). For all models, we compute the validation loss at every epoch. If there is no improvement for 100 epochs, we stop optimization and revert to the model parameters with the lowest validation loss. We select the hyperparameter configuration for each model that achieves the lowest average loss on the validation set. For each model, we consider different values of L2 regularization strength C ∈ {0, 10⁻⁵, 10⁻³}. Additionally, for SOSFlow we tune the number of transformation layers M ∈ {1, 2, 3} and for DSFlow M ∈ {1, 2, 3, 5, 10}. More specifically, we set K = 64 for LogNormMix, DSFlow and FullyNN. For SOSFlow we used K = 4 and R = 3, resulting in a polynomial of degree 7 (per each layer). (A sketch of this training protocol appears below the table.) |
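
For readers who want a concrete picture of the model family being evaluated, below is a minimal PyTorch sketch of a log-normal mixture density over inter-event times, in the spirit of the paper's LogNormMix model. The class name and the `context_dim` / `n_components` arguments are illustrative assumptions, not the authors' exact API; the actual implementation is available in the linked repository.

```python
# Minimal sketch of a log-normal mixture (LogNormMix-style) density:
# the conditional density of the next inter-event time is a mixture of K
# log-normal components whose weights, means, and scales are produced from a
# context vector (e.g., an RNN summary of the event history).
import torch
import torch.nn as nn
import torch.distributions as D


class LogNormalMixture(nn.Module):
    def __init__(self, context_dim: int, n_components: int = 64):
        super().__init__()
        # One linear head per distribution parameter, conditioned on history.
        self.logits = nn.Linear(context_dim, n_components)      # mixture weights
        self.means = nn.Linear(context_dim, n_components)       # means of log(tau)
        self.log_scales = nn.Linear(context_dim, n_components)  # log std of log(tau)

    def distribution(self, context: torch.Tensor) -> D.Distribution:
        mix = D.Categorical(logits=self.logits(context))
        comp = D.LogNormal(self.means(context), self.log_scales(context).exp())
        return D.MixtureSameFamily(mix, comp)

    def log_prob(self, inter_times: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # The negative of this quantity, summed over events, is the NLL used for training.
        return self.distribution(context).log_prob(inter_times)


# Usage: score a batch of inter-event times given per-event context vectors.
model = LogNormalMixture(context_dim=32, n_components=64)
context = torch.randn(8, 32)               # e.g., RNN hidden states
tau = torch.rand(8) + 0.1                  # strictly positive inter-event times
print(model.log_prob(tau, context).shape)  # torch.Size([8])
```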
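
Similarly, the training protocol quoted in the Experiment Setup row (Adam with learning rate 10⁻³, mini-batches of 64 sequences, up to 2000 epochs, early stopping after 100 epochs without validation improvement, and reverting to the best parameters) can be summarized by the following sketch. The `model`, `train_loader`, and `val_loader` objects, as well as the `model.log_prob(batch)` call, are placeholders, not the repository's actual interfaces.

```python
# Sketch of the training protocol from the Experiment Setup row.
import copy
import torch


def train(model, train_loader, val_loader, max_epochs=2000, patience=100, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_loss, best_state, epochs_without_improvement = float("inf"), None, 0

    for epoch in range(max_epochs):
        model.train()
        for batch in train_loader:                # mini-batches of 64 sequences
            optimizer.zero_grad()
            loss = -model.log_prob(batch).mean()  # negative log-likelihood (placeholder call)
            loss.backward()
            optimizer.step()

        # Validation loss is computed at every epoch and drives early stopping.
        model.eval()
        with torch.no_grad():
            val_loss = sum(-model.log_prob(b).mean().item() for b in val_loader)

        if val_loss < best_loss:
            best_loss = val_loss
            best_state = copy.deepcopy(model.state_dict())
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                             # no improvement for 100 epochs

    model.load_state_dict(best_state)             # revert to the best parameters
    return model
```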