Interacting Diffusion Processes for Event Sequence Forecasting
Authors: Mai Zeng, Florence Regol, Mark Coates
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that our proposal significantly outperforms state-of-the-art baselines for long-horizon forecasting of TPPs, while also improving efficiency. Our experimental analysis provides insight into how the model achieves this: it can capture more complex correlation structures and is better at predicting distant events. |
| Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada; (2) International Laboratory on Learning Systems (ILLS), Montreal, QC, Canada; (3) Mila Québec AI Institute, Montreal, QC, Canada. |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block (e.g., 'Algorithm', 'Pseudocode') found in the paper. |
| Open Source Code | Yes | The code and implementation are available at our official repository. |
| Open Datasets | Yes | We use six real-world datasets. Taobao (Zhu et al., 2018) tracks user clicks made on a website; Taxi (Whong, 2014) contains trips to neighborhoods by taxi drivers; Stack Overflow (Leskovec & Krevl, 2014) tracks the history of posts on Stack Overflow; Retweet (Zhou et al., 2013) tracks user interactions on social media; MOOC (Kumar et al., 2019) tracks user interactions within an online course system; and Amazon (Ni et al., 2019) tracks the sequence of product categories reviewed by a group of users. |
| Dataset Splits | Yes | We follow Xue et al. (2022) for the train/val/test splits, which are reported in Appendix A.4 together with additional dataset details. For Taobao, for example, the disjoint train, validation, and test sets consist of 1300, 200, and 500 sequences (users), respectively, randomly sampled from the dataset. A minimal split sketch appears after this table. |
| Hardware Specification | Yes | The experiments were run on a GeForce RTX 2070 SUPER machine. |
| Software Dependencies | Yes | For the two diffusion denoising functions ϵθ(·) and ϕθ(·), we use the PyTorch built-in transformer block (Paszke et al., 2019). We use the Box-Cox transformation function from the SciPy package provided by Virtanen et al. (2020). Hedged sketches of both pieces appear after this table. |
| Experiment Setup | Yes | We train for a maximum of 500 epochs and select the best hyperparameters using the Tree-Structured Parzen Estimator (TPE) search algorithm from Bergstra et al. (2011). Table 6 specifies the hyperparameters that we use for our experiments and the candidate values. A hedged TPE search sketch appears after this table. |
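
The split counts in the Dataset Splits row translate directly into a sequence-level random split. The sketch below is illustrative only: the `split_sequences` helper, the `seed`, and the input IDs are assumptions rather than the authors' protocol; only the 1300/200/500 Taobao counts come from the paper.

```python
import random

def split_sequences(sequence_ids, n_train=1300, n_val=200, n_test=500, seed=0):
    """Randomly sample disjoint train/val/test sets of user sequences.

    The 1300/200/500 counts follow the Taobao example reported in the
    paper; everything else here is an illustrative assumption.
    """
    assert len(sequence_ids) >= n_train + n_val + n_test
    rng = random.Random(seed)
    ids = list(sequence_ids)
    rng.shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:n_train + n_val + n_test]
    return train, val, test
```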
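
The Software Dependencies row names two concrete building blocks: PyTorch's built-in transformer block for the denoising functions ϵθ(·) and ϕθ(·), and SciPy's Box-Cox transform. The sketch below shows how such off-the-shelf pieces are typically wired together; the `DenoiserBackbone` module, its dimensions, and the application of Box-Cox to inter-arrival times are assumptions, not the authors' exact architecture.

```python
import torch.nn as nn
from scipy.stats import boxcox
from scipy.special import inv_boxcox

class DenoiserBackbone(nn.Module):
    """Hypothetical denoiser built on PyTorch's built-in transformer block."""

    def __init__(self, d_model=128, nhead=4, num_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        return self.out(self.encoder(x))

# Box-Cox transform of (positive) inter-arrival times via SciPy.
inter_arrival = [0.5, 1.2, 0.3, 2.4]        # illustrative values only
transformed, lmbda = boxcox(inter_arrival)  # fit lambda and transform
recovered = inv_boxcox(transformed, lmbda)  # invert after sampling
```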
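
The Experiment Setup row cites the TPE algorithm of Bergstra et al. (2011), whose reference implementation is the `hyperopt` package. The sketch below shows a generic TPE search, not the authors' code: the search space and the `train_and_validate` objective are hypothetical stand-ins for the hyperparameters and candidate values listed in the paper's Table 6.

```python
from hyperopt import Trials, fmin, hp, tpe

# Hypothetical search space; the real hyperparameters and candidate
# values are listed in Table 6 of the paper.
space = {
    "learning_rate": hp.loguniform("learning_rate", -9, -4),  # e^-9 .. e^-4
    "hidden_dim": hp.choice("hidden_dim", [64, 128, 256]),
    "num_layers": hp.choice("num_layers", [1, 2, 3]),
}

def train_and_validate(learning_rate, hidden_dim, num_layers, max_epochs):
    # Placeholder standing in for the real training loop; it returns a
    # synthetic score so the sketch runs end to end.
    return (learning_rate - 1e-3) ** 2 + hidden_dim * 1e-5 + num_layers * 1e-3

def objective(params):
    """Train for up to 500 epochs and return the validation loss."""
    return train_and_validate(**params, max_epochs=500)

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)
```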