Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment

Authors: Chenxiao Yang, Qitian Wu, Qingsong Wen, Zhiqiang Zhou, Liang Sun, Junchi Yan

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Comprehensive experiments on diverse tasks (e.g., sequential recommendation) demonstrate the effectiveness, applicability and scalability of our method with various off-the-shelf models as backbones." and "We carry out comprehensive experiments on three sequential event prediction tasks with evaluation protocols designed for testing model performance under temporal distribution shift."
Researcher Affiliation | Collaboration | "1 Department of Computer Science and Engineering, Shanghai Jiao Tong University; 2 DAMO Academy, Alibaba Group"
Pseudocode | No | The paper describes the methodology in text and uses figures to illustrate concepts and results, but it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured steps formatted like code.
Open Source Code | Yes | "The codes are available at https://github.com/chr26195/Caseq."
Open Datasets | Yes | "We use four datasets with variable length and number of event types: Movielens, Yelp, Stack Overflow and ATM. The detailed description of datasets and their statistics are deferred to Appendix E." and citation "[15] F. Maxwell Harper and Joseph A. Konstan. The MovieLens datasets: History and context. Pages 1-19, 2015."
Dataset Splits | Yes | "We use the last G + 1 events for testing, the first |S| - G - 2 events for training, and the (|S| - G - 1)-th event for validation."
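The quoted protocol splits each event sequence S by position from the end. A minimal sketch of that split, assuming a sequence is a Python list ordered by time (the function name and variable names are illustrative, not from the paper or its code):

```python
def split_sequence(events, G):
    """Per-sequence split described in the paper:
    - last G + 1 events        -> test
    - (|S| - G - 1)-th event   -> validation (1-indexed, so index n - G - 2)
    - first |S| - G - 2 events -> train
    """
    n = len(events)
    train = events[: n - G - 2]   # first |S| - G - 2 events
    val = events[n - G - 2]       # the single validation event
    test = events[n - G - 1:]     # the last G + 1 events
    return train, val, test

# Example: a 10-event sequence with gap size G = 3
train, val, test = split_sequence(list(range(10)), G=3)
# train has 5 events, val is 1 event, test has G + 1 = 4 events
```

Note that the three parts are disjoint and cover the whole sequence, so every event is used exactly once.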
Hardware Specification | No | The paper mentions 'limited computational resources' but does not provide any specific hardware details such as GPU/CPU models, memory, or types of computing clusters used for running experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., 'PyTorch 1.9' or 'TensorFlow 2.x').
Experiment Setup | No | While the paper mentions aspects of the evaluation protocol (e.g., "We set gap size g as 0, 30 for Movielens and 0, 20 for Yelp") and model architecture (e.g., "our model uses the same number of layers as the baselines"), it does not provide specific training hyperparameters such as learning rate, batch size, optimizer settings, or explicit model initialization details in the main text.