User-Dependent Neural Sequence Models for Continuous-Time Event Data

Authors: Alex Boyd, Robert Bamler, Stephan Mandt, Padhraic Smyth

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our methods on four large real-world datasets and demonstrate systematic improvements from our approach over existing work for a variety of predictive metrics such as log-likelihood, next event ranking, and source-of-sequence identification.
Researcher Affiliation | Academia | Alex Boyd (1), Robert Bamler (2), Stephan Mandt (1,2), Padhraic Smyth (1,2); (1) Department of Statistics, (2) Department of Computer Science, University of California, Irvine; {alexjb, rbamler, mandt}@uci.edu, smyth@ics.uci.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our source code for modeling and experiments can be found at the following repository: https://github.com/ajboyd2/vae_mpp.
Open Datasets | Yes | All models were trained and evaluated on four real-world datasets (see Table 1). The MemeTracker dataset [Leskovec and Krevl, 2014]... The Reddit comments dataset [Baumgartner et al., 2020]... Amazon Reviews [Ni et al., 2019]... The 4th dataset, Last.fm [Celma, 2010]
Dataset Splits | Yes | Training, validation, and test sets were split so that there were no users in common between them. ... Table 1: Statistics for the four datasets. Columns (left to right) are: ... total number of sequences and number of unique users in training/validation/test splits. (A user-disjoint split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU or GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not explicitly state specific software dependencies or their version numbers required to replicate the experiments.
Experiment Setup | Yes | Models were trained by minimizing Eq. 7 and Eq. 8, averaged over training sequences, for the decoder-only and MoE variants respectively, via the Adam optimizer with default hyperparameters [Kingma and Ba, 2014] and a learning rate of 0.001. A linear warm-up schedule for the learning rate over the first training epoch was used as it led to more stable training across runs. We also performed cyclical annealing on β in Eq. 8 from 0 to 0.001 with a period of 20% of an epoch to help prevent the posterior distribution from collapsing to the prior [Fu et al., 2019]. (A training-loop sketch follows the table.)
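
The user-disjoint split quoted in the Dataset Splits row can be reproduced with a few lines of preprocessing. Below is a minimal Python sketch, not taken from the paper or its repository; the list-of-dicts sequence format, the "user_id" field name, and the 80/10/10 proportions are illustrative assumptions.

    import random

    def split_by_user(sequences, val_frac=0.1, test_frac=0.1, seed=0):
        """Split event sequences so that no user appears in more than one split.

        `sequences` is assumed to be a list of dicts with a "user_id" key;
        the field name and split proportions are illustrative assumptions,
        not values taken from the paper or its repository.
        """
        users = sorted({seq["user_id"] for seq in sequences})
        rng = random.Random(seed)
        rng.shuffle(users)

        n_val = int(len(users) * val_frac)
        n_test = int(len(users) * test_frac)
        val_users = set(users[:n_val])
        test_users = set(users[n_val:n_val + n_test])

        train, val, test = [], [], []
        for seq in sequences:
            if seq["user_id"] in val_users:
                val.append(seq)
            elif seq["user_id"] in test_users:
                test.append(seq)
            else:
                train.append(seq)
        return train, val, test

Because the split is done over users rather than sequences, every sequence belonging to a held-out user lands in the same partition, matching the "no users in common" condition quoted above.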
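
The Experiment Setup row pins down the optimizer, the learning-rate warm-up, and the cyclical β annealing. The PyTorch sketch below wires those stated settings together; the `model.loss` interface, the batch loop, and the sawtooth shape of the β cycle are assumptions for illustration (the paper states only the 0-to-0.001 range and the 20%-of-an-epoch period).

    import torch

    def beta_schedule(step, steps_per_epoch, beta_max=1e-3, period_frac=0.2):
        """Cyclical annealing of the KL weight beta: ramp linearly from 0 to
        beta_max over each cycle of length 0.2 * steps_per_epoch. The sawtooth
        shape is an assumption; the paper gives only the range and period."""
        period = max(1, int(period_frac * steps_per_epoch))
        return beta_max * ((step % period) / period)

    def lr_warmup_factor(step, steps_per_epoch):
        """Linear warm-up of the learning rate over the first training epoch."""
        return min(1.0, (step + 1) / steps_per_epoch)

    def train(model, train_loader, n_epochs=10, base_lr=1e-3):
        # Adam with default hyperparameters and learning rate 0.001, as stated.
        optimizer = torch.optim.Adam(model.parameters(), lr=base_lr)
        scheduler = torch.optim.lr_scheduler.LambdaLR(
            optimizer, lambda step: lr_warmup_factor(step, len(train_loader)))

        step = 0
        for _ in range(n_epochs):
            for batch in train_loader:
                beta = beta_schedule(step, len(train_loader))
                # `model.loss` returning (nll, kl) is a hypothetical interface;
                # the MoE objective (Eq. 8) is an ELBO-style loss whose KL term
                # is weighted by the annealed beta.
                nll, kl = model.loss(batch)
                loss = nll + beta * kl

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                scheduler.step()
                step += 1

For the decoder-only variant (Eq. 7) the same loop applies with the β-weighted KL term dropped, i.e. only the negative log-likelihood is minimized.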