Language Models Can Improve Event Prediction by Few-Shot Abductive Reasoning
Authors: Xiaoming Shi, Siqiao Xue, Kangrui Wang, Fan Zhou, James Zhang, Jun Zhou, Chenhao Tan, Hongyuan Mei
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on several challenging real-world datasets, we demonstrate that our framework, thanks to the reasoning capabilities of large language models, could significantly outperform the state-of-the-art event sequence models. |
| Researcher Affiliation | Collaboration | 1 Ant Group, 2 UChicago, 3 TTIC. Emails: {peter.sxm,siqiao.xsq,hanlian.zf,james.z,jun.zhoujun}@antgroup.com, chenhao@uchicago.edu, {kangrui,hongyuan}@ttic.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is at https://github.com/iLampard/lamp. |
| Open Datasets | Yes | GDELT (Leetaru & Schrodt, 2013), ICEWS (Boschee et al., 2015), and Amazon Review (Ni et al., 2019). |
| Dataset Splits | Yes | We split the dataset into disjoint train, dev, and test sets based on their dates: the 83100 events that happened before 2022-07-05 are training data; the 16650 events after 2022-07-19 are test data; the 9250 events between these dates are development data. (Example for GDELT; similar splits are provided for the other datasets. A sketch of this date-based split appears below the table.) |
| Hardware Specification | Yes | All the experiments were conducted on a server with 256G RAM, a CPU with 64 logical cores (Intel(R) Xeon(R) Platinum 8163 @ 2.50GHz), and one NVIDIA A100 GPU for acceleration. |
| Software Dependencies | No | All models are implemented using the PyTorch framework (Paszke et al., 2017). For the implementation of NHP, AttNHP, and the energy functions, we used the code from the public GitHub repository at https://github.com/ant-research/EasyTemporalPointProcess (Xue et al., 2023), released under the Apache License 2.0. ... The paper names software and licenses but lacks specific version numbers for dependencies beyond the PyTorch reference. |
| Experiment Setup | Yes | For each model, we did a grid search and chose the hyperparameters based on their performance on the dev set; see Table 2 for the hyperparameter values. ... Training the Ranking Model. The energy function used in our ranking model is the same as the one proposed in Xue et al. (2022), consisting of a continuous-time Transformer and an MLP. The hyperparameters are tuned within a range of values that keeps the score function at a similar size to the base AttNHP model. (A sketch of this dev-set grid search also appears below the table.) |
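
The date-based split quoted under Dataset Splits is concrete enough to sketch. The snippet below is a minimal illustration, not the authors' code: the event representation (a list of dicts with a `date` field), the function name `split_by_date`, and the boundary handling at the cutoff dates are all assumptions; only the GDELT cutoff dates come from the paper.

```python
from datetime import date

# Hypothetical sketch of the date-based GDELT split quoted above.
# Only the cutoff dates are taken from the paper; the event
# representation and boundary handling are assumptions.
TRAIN_CUTOFF = date(2022, 7, 5)   # events strictly before -> train
TEST_CUTOFF = date(2022, 7, 19)   # events strictly after  -> test

def split_by_date(events):
    """Partition events into disjoint train/dev/test sets by date."""
    train = [e for e in events if e["date"] < TRAIN_CUTOFF]
    dev = [e for e in events if TRAIN_CUTOFF <= e["date"] <= TEST_CUTOFF]
    test = [e for e in events if e["date"] > TEST_CUTOFF]
    return train, dev, test
```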
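Similarly, the dev-set hyperparameter selection described under Experiment Setup can be sketched generically. The grid values and the `train_fn`/`eval_fn` callables below are illustrative placeholders, not the paper's Table 2 settings or its actual training loop.

```python
import itertools

# Generic sketch of grid search with dev-set model selection.
# GRID values are placeholders, not the paper's Table 2 settings.
GRID = {"hidden_dim": [32, 64, 128], "learning_rate": [1e-3, 1e-4]}

def grid_search(train_fn, eval_fn, train_data, dev_data):
    """Return the configuration with the best dev-set score."""
    best_cfg, best_score = None, float("-inf")
    for values in itertools.product(*GRID.values()):
        cfg = dict(zip(GRID.keys(), values))
        model = train_fn(train_data, **cfg)  # fit one candidate model
        score = eval_fn(model, dev_data)     # e.g. a ranking metric on dev
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```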