Transformer Embeddings of Irregularly Spaced Events and Their Participants
Authors: Chenghao Yang, Hongyuan Mei, Jason Eisner
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On several synthetic and real-world datasets, we evaluate our model's held-out log-likelihood, and its success at predicting the time and type of the next event. We compare with multiple strong competitors. |
| Researcher Affiliation | Collaboration | Chenghao Yang, Dept. of Computer Science, Columbia University, yangalan1996@gmail.com; Hongyuan Mei, Toyota Tech. Institute at Chicago, hongyuan@ttic.edu; Jason Eisner, Dept. of Computer Science, Johns Hopkins University, jason@cs.jhu.edu |
| Pseudocode | No | The paper refers to an algorithm from previous work ('Mei & Eisner (2017), Algorithm 1') but does not include any pseudocode or algorithm blocks within its own text. |
| Open Source Code | Yes | We release our code. ... Our code and datasets are available at https://github.com/yangalan123/anhp-andtt. |
| Open Datasets | Yes | We release our code. ... Our code and datasets are available at https://github.com/yangalan123/anhp-andtt. For MIMIC-II and Stack Overflow, we used the version processed by Du et al. (2016); more details (e.g., about processing) can be found in their paper. For RoboCup, we used the version processed by Chen & Mooney (2008); please refer to their paper for more details (e.g., data description, processing method, etc.) |
| Dataset Splits | Yes | Table 1: Statistics of each dataset. ... SYNTHETIC: TRAIN 59904, DEV 7425, TEST 7505. MIMIC-II: TRAIN 1930, DEV 252, TEST 237. STACKOVERFLOW: TRAIN 345116, DEV 38065, TEST 97233. ROBOCUP: TRAIN 2195, DEV 817, TEST 780. |
| Hardware Specification | Yes | For the experiments in section 7.1, we used the standalone PyTorch implementations for NHP and A-NHP, which are GPU-friendly. We trained each model on an NVIDIA K80 GPU. ... The machines we used for NDTT and A-NDTT are 6-core Haswell architectures. |
| Software Dependencies | Yes | We implemented our A-NDTT framework using PyTorch (Paszke et al., 2017) and pyDatalog (Carbonell et al., 2016)... We made a considerable amount of modifications to their code (e.g., model, thinning algorithm), in order to migrate it to PyTorch 1.7. |
| Experiment Setup | Yes | We tuned these hyperparameters for each combination of model, dataset, and training size (e.g., each bar in Figures 2, 3a and 5), always choosing the combination of D and L that achieved the best performance on the dev set. Our search spaces were D ∈ {4, 8, 16, 32, 64, 128} and L ∈ {1, 2, 3, 4, 5, 6}. ... To train the parameters for a given model, we used the Adam algorithm (Kingma & Ba, 2015) with its default settings. We performed early stopping based on log-likelihood on the held-out dev set. A minimal sketch of this tuning loop appears after the table. |
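
The Experiment Setup row above describes a grid search over the hidden dimension D and layer count L, with each candidate trained by Adam at its default settings and early-stopped on held-out dev log-likelihood. The snippet below is a minimal, self-contained sketch of that protocol only; the toy MLP, synthetic data, and helper names (`build_model`, `dev_log_likelihood`, `train_with_early_stopping`) are illustrative assumptions and are not taken from the authors' A-NHP/A-NDTT code.

```python
import itertools
import torch
from torch import nn

torch.manual_seed(0)

# Toy classification data standing in for an event-sequence corpus.
NUM_TYPES = 5
X_train, y_train = torch.randn(512, 16), torch.randint(0, NUM_TYPES, (512,))
X_dev,   y_dev   = torch.randn(128, 16), torch.randint(0, NUM_TYPES, (128,))

def build_model(d: int, num_layers: int) -> nn.Module:
    """A small MLP whose width and depth play the roles of D and L."""
    layers, in_dim = [], 16
    for _ in range(num_layers):
        layers += [nn.Linear(in_dim, d), nn.ReLU()]
        in_dim = d
    layers.append(nn.Linear(in_dim, NUM_TYPES))
    return nn.Sequential(*layers)

def dev_log_likelihood(model: nn.Module) -> float:
    """Mean log-likelihood of the dev labels under the model."""
    with torch.no_grad():
        log_probs = model(X_dev).log_softmax(dim=-1)
        return log_probs.gather(1, y_dev.unsqueeze(1)).mean().item()

def train_with_early_stopping(model: nn.Module, patience: int = 5) -> float:
    optimizer = torch.optim.Adam(model.parameters())  # Adam with default settings
    loss_fn = nn.CrossEntropyLoss()
    best_ll, epochs_without_gain = float("-inf"), 0
    for epoch in range(100):
        optimizer.zero_grad()
        loss = loss_fn(model(X_train), y_train)
        loss.backward()
        optimizer.step()
        ll = dev_log_likelihood(model)
        if ll > best_ll:
            best_ll, epochs_without_gain = ll, 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                break  # early stopping on held-out dev log-likelihood
    return best_ll

# Search spaces quoted from the paper: D in {4,...,128}, L in {1,...,6}.
best = max(
    ((D, L, train_with_early_stopping(build_model(D, L)))
     for D, L in itertools.product([4, 8, 16, 32, 64, 128], [1, 2, 3, 4, 5, 6])),
    key=lambda t: t[2],
)
print(f"best D={best[0]}, L={best[1]}, dev log-likelihood={best[2]:.4f}")
```

As in the paper, model selection here is driven entirely by dev-set log-likelihood: each (D, L) candidate is trained independently and the combination with the best held-out score is kept.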