Language Models Can Improve Event Prediction by Few-Shot Abductive Reasoning
Authors: Xiaoming Shi, Siqiao Xue, Kangrui Wang, Fan Zhou, James Zhang, Jun Zhou, Chenhao Tan, Hongyuan Mei
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on several challenging real-world datasets, we demonstrate that our framework, thanks to the reasoning capabilities of large language models, could significantly outperform the state-of-the-art event sequence models. |
| Researcher Affiliation | Collaboration | 1 Ant Group, 2 UChicago, 3 TTIC. Emails: {peter.sxm,siqiao.xsq,hanlian.zf,james.z,jun.zhoujun}@antgroup.com, chenhao@uchicago.edu, {kangrui,hongyuan}@ttic.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is at https://github.com/iLampard/lamp. |
| Open Datasets | Yes | GDELT (Leetaru & Schrodt, 2013), ICEWS (Boschee et al., 2015), and Amazon Review (Ni et al., 2019). |
| Dataset Splits | Yes | We split the dataset into disjoint train, dev, and test sets based on their dates: the 83100 events that happened before 2022-07-05 are training data; the 16650 events after 2022-07-19 are test data; the 9250 events between these dates are development data. (Example for GDELT; similar splits are provided for the other datasets. A sketch of this date-based split appears below the table.) |
| Hardware Specification | Yes | All the experiments were conducted on a server with 256G RAM, a CPU with 64 logical cores (Intel(R) Xeon(R) Platinum 8163 @ 2.50GHz), and one NVIDIA A100 GPU for acceleration. |
| Software Dependencies | No | All models are implemented using the PyTorch framework (Paszke et al., 2017). For the implementation of NHP, AttNHP, and the energy functions, we used the code from the public GitHub repository at https://github.com/ant-research/EasyTemporalPointProcess (Xue et al., 2023), released under the Apache License 2.0. ... The paper names software and licenses but lacks specific version numbers for dependencies beyond the PyTorch reference. |
| Experiment Setup | Yes | For each model, we did a grid search and chose the hyperparameters based on their performance on the dev set; see Table 2 for the hyperparameter values. ... Training the Ranking Model. The energy function used in our ranking model is the same as the one proposed in Xue et al. (2022), consisting of a continuous-time Transformer and an MLP. The hyperparameters are tuned within a range of values that keeps the score function at a similar size to the base AttNHP model. (A sketch of this dev-set grid search also appears below the table.) |
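
The date-based split quoted under Dataset Splits is concrete enough to sketch. The snippet below is a minimal illustration, not the authors' code: the event representation (a list of dicts with a `date` field), the function name `split_by_date`, and the boundary handling at the cutoff dates are all assumptions; only the GDELT cutoff dates come from the paper.

```python
from datetime import date

# Hypothetical sketch of the date-based GDELT split quoted above.
# Only the cutoff dates are taken from the paper; the event
# representation and boundary handling are assumptions.
TRAIN_CUTOFF = date(2022, 7, 5)   # events strictly before -> train
TEST_CUTOFF = date(2022, 7, 19)   # events strictly after  -> test

def split_by_date(events):
    """Partition events into disjoint train/dev/test sets by date."""
    train = [e for e in events if e["date"] < TRAIN_CUTOFF]
    dev = [e for e in events if TRAIN_CUTOFF <= e["date"] <= TEST_CUTOFF]
    test = [e for e in events if e["date"] > TEST_CUTOFF]
    return train, dev, test
```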
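Similarly, the dev-set hyperparameter selection described under Experiment Setup can be sketched generically. The grid values and the `train_fn`/`eval_fn` callables below are illustrative placeholders, not the paper's Table 2 settings or its actual training loop.

```python
import itertools

# Generic sketch of grid search with dev-set model selection.
# GRID values are placeholders, not the paper's Table 2 settings.
GRID = {"hidden_dim": [32, 64, 128], "learning_rate": [1e-3, 1e-4]}

def grid_search(train_fn, eval_fn, train_data, dev_data):
    """Return the configuration with the best dev-set score."""
    best_cfg, best_score = None, float("-inf")
    for values in itertools.product(*GRID.values()):
        cfg = dict(zip(GRID.keys(), values))
        model = train_fn(train_data, **cfg)  # fit one candidate model
        score = eval_fn(model, dev_data)     # e.g. a ranking metric on dev
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```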