Efficient Non-parametric Bayesian Hawkes Processes
Authors: Rui Zhang, Christian Walder, Marian-Andrei Rizoiu, Lexing Xie
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On synthetic data, we show our methods to be able to infer flexible Hawkes triggering kernels. On two large-scale Twitter diffusion datasets, we show that our methods outperform the current state-of-the-art in goodness-of-fit and that the time complexity is linear in the size of the dataset. We now evaluate our proposed approaches Gibbs-Hawkes and EM-Hawkes and compare them to three baseline models, on synthetic data and on two large Twitter online diffusion datasets. |
| Researcher Affiliation | Collaboration | Rui Zhang (1,2), Christian Walder (1,2), Marian-Andrei Rizoiu (3) and Lexing Xie (1,2); (1) The Australian National University, Australia; (2) Data61 CSIRO, Australia; (3) University of Technology Sydney, Australia |
| Pseudocode | No | The paper includes Figure 2, a 'visual summary of the Gibbs-Hawkes, EM-Hawkes and the EM algorithms', but it is a conceptual diagram rather than structured pseudocode or an algorithm block. |
| Open Source Code | No | No explicit statement providing open-source code for the methodology or a direct link to a code repository was found. The paper mentions that 'Codes of ODE based and WH based methods are publicly available [Bacry et al., 2017]', but this refers to baselines, not the authors' own method. |
| Open Datasets | Yes | ACTIVE [Rizoiu et al., 2018] contains 41k retweet cascades, each containing at least 20 (re)tweets with links to YouTube videos. SEISMIC [Zhao et al., 2015] contains 166k randomly sampled retweet cascades, collected from Oct 7 to Nov 7, 2011. |
| Dataset Splits | Yes | Each toy model generates 400 point sequences over Ω = [0, π], which are evenly split into 40 groups, 20 for training and 20 for test. The temporal extent of each cascade is scaled to [0, π], and each cascade is assigned to either training or test data with equal probability. (See the split sketch after this table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were provided in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers were listed. The paper mentions 'tick: a python library for statistical learning' in reference [Bacry et al., 2017] but does not state versions for its own implementation. |
| Experiment Setup | Yes | For Gibbs-Hawkes and EM-Hawkes, we must select the parameters of the GP kernel (Eqs. (12) to (14)). Having many basis functions leads to high fitting accuracy but low speed; using 32 basis functions provides a suitable balance. For the kernel parameters a, b of Eq. (13), we choose a = b = 0.002. 5000 iterations are run to fit each group and the first 1000 are discarded (i.e., burn-in). (A hedged configuration sketch follows the table.) |
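The Dataset Splits row describes a concrete procedure. Below is a minimal Python sketch of that split, assuming synthetic sequences represented as sorted arrays of event times on [0, π]; the sequence generator, variable names, and the toy cascades are illustrative assumptions, not the authors' released code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setting: 400 point sequences on [0, pi], evenly split into
# 40 groups of 10, of which 20 groups are for training and 20 for test.
sequences = [np.sort(rng.uniform(0.0, np.pi, size=rng.integers(5, 50)))
             for _ in range(400)]
groups = np.array_split(np.arange(400), 40)   # 40 groups of 10 sequences
order = rng.permutation(40)
train_groups = [[sequences[i] for i in groups[g]] for g in order[:20]]
test_groups = [[sequences[i] for i in groups[g]] for g in order[20:]]

# Twitter cascades: rescale each cascade's temporal extent to [0, pi],
# then assign it to training or test with equal probability.
def rescale(times):
    """Map a cascade's event times affinely onto [0, pi]."""
    t = np.asarray(times, dtype=float) - times[0]
    return t * (np.pi / t[-1]) if t[-1] > 0 else t

cascades = [np.sort(rng.uniform(0, 3600, size=25)) for _ in range(10)]  # toy stand-in
train, test = [], []
for c in cascades:
    (train if rng.random() < 0.5 else test).append(rescale(c))
```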
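The Experiment Setup row lists concrete hyperparameters. The following hedged sketch collects them into a configuration and shows a generic Gibbs loop with the stated burn-in; the dict keys and the `run_chain` helper are hypothetical, and the actual update (sampling the Hawkes triggering-kernel posterior, per Figure 2) is not reproduced here.

```python
# Configuration reflecting only the settings stated in the paper;
# key names are illustrative, not the authors' API.
config = {
    "n_basis": 32,      # basis functions: fitting accuracy vs. speed trade-off
    "a": 0.002,         # GP kernel parameter a of Eq. (13)
    "b": 0.002,         # GP kernel parameter b of Eq. (13)
    "n_iter": 5000,     # Gibbs iterations per group
    "burn_in": 1000,    # initial samples discarded
}

def run_chain(step, init, n_iter, burn_in):
    """Generic MCMC loop: run `n_iter` steps of `step` (a hypothetical
    one-sample Gibbs update) and keep states after `burn_in`."""
    state, kept = init, []
    for i in range(n_iter):
        state = step(state)
        if i >= burn_in:
            kept.append(state)
    return kept

# Dummy usage with a placeholder random-walk update, only to show the
# iteration/burn-in bookkeeping.
import random
samples = run_chain(lambda s: s + random.gauss(0, 1), 0.0,
                    config["n_iter"], config["burn_in"])
assert len(samples) == config["n_iter"] - config["burn_in"]  # 4000 kept
```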