Transformer Embeddings of Irregularly Spaced Events and Their Participants

Authors: Hongyuan Mei, Chenghao Yang, Jason Eisner

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On several synthetic and real-world datasets, we evaluate our model's held-out log-likelihood, and its success at predicting the time and type of the next event. We compare with multiple strong competitors.
Researcher Affiliation | Collaboration | Chenghao Yang, Dept. of Computer Science, Columbia University, yangalan1996@gmail.com; Hongyuan Mei, Toyota Tech. Institute at Chicago, hongyuan@ttic.edu; Jason Eisner, Dept. of Computer Science, Johns Hopkins University, jason@cs.jhu.edu
Pseudocode | No | The paper refers to an algorithm from previous work ("Mei & Eisner (2017), Algorithm 1") but does not include any pseudocode or algorithm blocks within its own text. (An illustrative thinning sketch is given after this table.)
Open Source Code | Yes | We release our code. ... Our code and datasets are available at https://github.com/yangalan123/anhp-andtt.
Open Datasets | Yes | We release our code. ... Our code and datasets are available at https://github.com/yangalan123/anhp-andtt. For MIMIC-II and Stack Overflow, we used the version processed by Du et al. (2016); more details (e.g., about processing) can be found in their paper. For RoboCup, we used the version processed by Chen & Mooney (2008); please refer to their paper for more details (e.g., data description, processing method, etc.).
Dataset Splits | Yes | Table 1: Statistics of each dataset. ... SYNTHETIC: TRAIN 59904, DEV 7425, TEST 7505. MIMIC-II: TRAIN 1930, DEV 252, TEST 237. STACKOVERFLOW: TRAIN 345116, DEV 38065, TEST 97233. ROBOCUP: TRAIN 2195, DEV 817, TEST 780.
Hardware Specification | Yes | For the experiments in section 7.1, we used the standalone PyTorch implementations for NHP and A-NHP, which are GPU-friendly. We trained each model on an NVIDIA K80 GPU. ... The machines we used for NDTT and A-NDTT are 6-core Haswell architectures.
Software Dependencies | Yes | We implemented our A-NDTT framework using PyTorch (Paszke et al., 2017) and pyDatalog (Carbonell et al., 2016)... We made a considerable amount of modifications to their code (e.g., model, thinning algorithm), in order to migrate it to PyTorch 1.7.
Experiment Setup | Yes | We tuned these hyperparameters for each combination of model, dataset, and training size (e.g., each bar in Figures 2, 3a and 5), always choosing the combination of D and L that achieved the best performance on the dev set. Our search spaces were D ∈ {4, 8, 16, 32, 64, 128} and L ∈ {1, 2, 3, 4, 5, 6}. ... To train the parameters for a given model, we used the Adam algorithm (Kingma & Ba, 2015) with its default settings. We performed early stopping based on log-likelihood on the held-out dev set.
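For illustration only, the sketch below shows the kind of tuning and training protocol the Experiment Setup row describes: a grid search over hidden dimension D and layer count L, Adam with its default settings, and early stopping on dev-set log-likelihood. The toy model, toy surrogate objective, and random data are hypothetical stand-ins so the sketch runs end to end; they are not taken from the released anhp-andtt code.

```python
import itertools
import torch

# Hypothetical stand-ins (NOT from the released code): a toy model and a toy
# surrogate "log-likelihood", used only so the protocol sketch is runnable.
def build_model(hidden_dim, num_layers):
    layers, d_in = [], 1
    for _ in range(num_layers):
        layers += [torch.nn.Linear(d_in, hidden_dim), torch.nn.ReLU()]
        d_in = hidden_dim
    layers.append(torch.nn.Linear(d_in, 1))
    return torch.nn.Sequential(*layers)

def log_likelihood(model, batch):
    # Toy surrogate objective (bounded above by 0), standing in for the
    # event-sequence log-likelihood that the paper maximizes.
    return -(model(batch) ** 2).sum()

train_data = [torch.randn(32, 1) for _ in range(10)]
dev_data = [torch.randn(32, 1) for _ in range(3)]

def train_with_early_stopping(model, max_epochs=20, patience=5):
    optimizer = torch.optim.Adam(model.parameters())  # default settings, as reported
    best_dev_ll, best_state, since_best = float("-inf"), None, 0
    for _ in range(max_epochs):
        model.train()
        for batch in train_data:
            optimizer.zero_grad()
            (-log_likelihood(model, batch)).backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():
            dev_ll = sum(log_likelihood(model, b).item() for b in dev_data)
        if dev_ll > best_dev_ll:
            best_dev_ll, best_state, since_best = dev_ll, model.state_dict(), 0
        else:
            since_best += 1
            if since_best >= patience:  # stop when dev log-likelihood stops improving
                break
    return best_dev_ll, best_state

# Grid search over the reported spaces for D (hidden size) and L (number of
# layers), keeping whichever combination does best on the dev set.
best_ll, best_config = float("-inf"), None
for D, L in itertools.product([4, 8, 16, 32, 64, 128], [1, 2, 3, 4, 5, 6]):
    dev_ll, _ = train_with_early_stopping(build_model(D, L))
    if dev_ll > best_ll:
        best_ll, best_config = dev_ll, (D, L)
print("best (D, L) on dev:", best_config)
```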
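The Pseudocode row above notes that the paper cites Mei & Eisner (2017), Algorithm 1 (a thinning algorithm) without reproducing it. As a rough illustration of the general technique that reference points to, here is a minimal Ogata-style thinning sketch for drawing the next event time from a temporal point process. The function name, its signature, and the toy intensity are hypothetical; this is not the paper's Algorithm 1.

```python
import math
import random

def sample_next_event_time(intensity_fn, t_start, intensity_bound, t_max):
    """Ogata-style thinning. Requires intensity_fn(t) <= intensity_bound
    for all t in [t_start, t_max]."""
    t = t_start
    while True:
        # Propose the next candidate time from a homogeneous Poisson process
        # whose rate is the upper bound on the true intensity.
        t += random.expovariate(intensity_bound)
        if t >= t_max:
            return None  # no event occurs before t_max
        # Accept the candidate with probability intensity_fn(t) / intensity_bound.
        if random.random() * intensity_bound < intensity_fn(t):
            return t

# Toy usage: an exponentially decaying intensity, bounded above by 0.5.
random.seed(0)
print(sample_next_event_time(lambda t: 0.5 * math.exp(-t), 0.0, 0.5, 10.0))
```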