Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment

Authors: Chenxiao Yang, Qitian Wu, Qingsong Wen, Zhiqiang Zhou, Liang Sun, Junchi Yan

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Comprehensive experiments on diverse tasks (e.g., sequential recommendation) demonstrate the effectiveness, applicability and scalability of our method with various off-the-shelf models as backbones." and "We carry out comprehensive experiments on three sequential event prediction tasks with evaluation protocols designed for testing model performance under temporal distribution shift."
Researcher Affiliation | Collaboration | "1 Department of Computer Science and Engineering, Shanghai Jiao Tong University; 2 DAMO Academy, Alibaba Group"
Pseudocode | No | The paper describes the methodology in text and uses figures to illustrate concepts and results, but it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured steps formatted like code.
Open Source Code | Yes | "The codes are available at https://github.com/chr26195/Caseq."
Open Datasets | Yes | "We use four datasets with variable length and number of event types: Movielens, Yelp, Stack Overflow and ATM. The detailed description of datasets and their statistics are deferred to Appendix E." and citation "[15] F. Maxwell Harper and Joseph A. Konstan. The MovieLens datasets: History and context. Pages 1-19, 2015."
Dataset Splits | Yes | "We use the last G + 1 events for testing, the first |S| - G - 2 events for training, and the (|S| - G - 1)-th event for validation."
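The quoted protocol splits each event sequence S by position from the end. A minimal sketch of that split, assuming a sequence is a Python list ordered by time (the function name and variable names are illustrative, not from the paper or its code):

```python
def split_sequence(events, G):
    """Per-sequence split described in the paper:
    - last G + 1 events        -> test
    - (|S| - G - 1)-th event   -> validation (1-indexed, so index n - G - 2)
    - first |S| - G - 2 events -> train
    """
    n = len(events)
    train = events[: n - G - 2]   # first |S| - G - 2 events
    val = events[n - G - 2]       # the single validation event
    test = events[n - G - 1:]     # the last G + 1 events
    return train, val, test

# Example: a 10-event sequence with gap size G = 3
train, val, test = split_sequence(list(range(10)), G=3)
# train has 5 events, val is 1 event, test has G + 1 = 4 events
```

Note that the three parts are disjoint and cover the whole sequence, so every event is used exactly once.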
Hardware Specification | No | The paper mentions 'limited computational resources' but does not provide any specific hardware details such as GPU/CPU models, memory, or types of computing clusters used for running experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., 'PyTorch 1.9' or 'TensorFlow 2.x').
Experiment Setup | No | While the paper mentions aspects of the evaluation protocol (e.g., "We set gap size g as 0, 30 for Movielens and 0, 20 for Yelp") and model architecture (e.g., "our model uses the same number of layers as the baselines"), it does not provide specific training hyperparameters such as learning rate, batch size, optimizer settings, or explicit model initialization details in the main text.