Concurrent Multi-Label Prediction in Event Streams
Authors: Xiao Shou, Tian Gao, Dharmashankar Subramanian, Debarun Bhattacharjya, Kristin P. Bennett
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the superior performance of our approach compared to existing baselines on multiple synthetic and real benchmarks. We conduct an extensive empirical investigation including ablation studies and demonstrate superior performance of our proposed model as compared to state-of-the-art baselines for next event multi-label prediction. |
| Researcher Affiliation | Collaboration | Xiao Shou1,2, Tian Gao3, Dharmashankar Subramanian3, Debarun Bhattacharjya3, Kristin P. Bennett1,2 1 Department of Mathematical Sciences, Rensselaer Polytechnic Institute 2 Department of Computer Science, Rensselaer Polytechnic Institute 3 Research AI, IBM T. J. Watson Research Center, Yorktown Heights, NY, USA |
| Pseudocode | No | The paper describes algorithms and procedures in prose and mathematical equations but does not include any clearly labeled pseudocode blocks or algorithm figures. |
| Open Source Code | Yes | Further details and codes are included in Appendix A in supplementary material. |
| Open Datasets | Yes | Synthea. This is a simulated EHR dataset that closely mimics real EHR data (Walonoski et al. 2018). Dunnhumby. We extract this dataset from Kaggle's Dunnhumby "The Complete Journey" dataset. MIMIC III. The MIMIC III database provides patient-level de-identified health-related data associated with the Beth Israel Deaconess Medical Center between 2001 and 2012 (Johnson, Pollard, and Mark 2016; Johnson et al. 2016; Goldberger et al. 2000). Defi. This dataset provides user-level cryptocurrency trading history under a specific protocol called Aave. The data includes timestamp, transaction type and coin type for each transaction. |
| Dataset Splits | Yes | We generate 5 simulations, each of which consists of a total of 1000 sequences and is randomly split into 60-20-20 training-dev-test subsets. (A minimal sketch of such a split appears after the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper mentions training with "Pytorch" but does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | We implement and train our model with Pytorch and report results using 64 Bernoulli mixture components for all experiments. Hyper-parameter λ is chosen from {0.1, 0.01, 0.001} and is only used if domain knowledge is injected; otherwise it is set to 0. (A hedged sketch of a 64-component Bernoulli mixture head follows below.) |
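
The quoted 60-20-20 split is straightforward to reproduce. Below is a minimal sketch, assuming each simulation's sequences are held in a plain Python list; the function name `split_sequences`, the fixed seed, and the container format are illustrative assumptions, while the proportions and the 1000-sequences-per-simulation figure come from the paper's quoted text.

```python
import random

def split_sequences(sequences, train_frac=0.6, dev_frac=0.2, seed=0):
    """Randomly partition event sequences into train/dev/test subsets.

    Proportions follow the 60-20-20 split quoted in the paper; the
    list-of-sequences container format is an assumption.
    """
    rng = random.Random(seed)
    indices = list(range(len(sequences)))
    rng.shuffle(indices)
    n_train = int(train_frac * len(sequences))
    n_dev = int(dev_frac * len(sequences))
    train = [sequences[i] for i in indices[:n_train]]
    dev = [sequences[i] for i in indices[n_train:n_train + n_dev]]
    test = [sequences[i] for i in indices[n_train + n_dev:]]
    return train, dev, test

# Example: 1000 placeholder sequences, as in each simulation from the paper.
sequences = [f"seq_{i}" for i in range(1000)]
train, dev, test = split_sequences(sequences)
print(len(train), len(dev), len(test))  # 600 200 200
```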
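The experiment setup quotes training in PyTorch with 64 Bernoulli mixture components. The sketch below shows one plausible Bernoulli-mixture output head under those settings; the class name, layer shapes, and the hidden-state interface are hypothetical rather than the authors' architecture, and the λ-weighted domain-knowledge term is omitted because its form is not given in the quoted text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BernoulliMixtureHead(nn.Module):
    """Maps a hidden state to a mixture of Bernoullis over L labels.

    K = 64 mixture components follows the experiment setup quoted above;
    everything else (layer shapes, names) is an illustrative assumption.
    """
    def __init__(self, hidden_dim, num_labels, num_components=64):
        super().__init__()
        self.mix_logits = nn.Linear(hidden_dim, num_components)
        self.label_logits = nn.Linear(hidden_dim, num_components * num_labels)
        self.num_components = num_components
        self.num_labels = num_labels

    def neg_log_likelihood(self, h, y):
        # h: (batch, hidden_dim); y: (batch, num_labels), entries in {0, 1}
        log_pi = F.log_softmax(self.mix_logits(h), dim=-1)            # (B, K)
        logits = self.label_logits(h).view(
            -1, self.num_components, self.num_labels)                 # (B, K, L)
        # Per-component Bernoulli log-likelihood of the observed label vector.
        log_p = -F.binary_cross_entropy_with_logits(
            logits, y.unsqueeze(1).expand_as(logits), reduction="none"
        ).sum(dim=-1)                                                 # (B, K)
        # Marginalize over mixture components in log space.
        return -torch.logsumexp(log_pi + log_p, dim=-1).mean()

# Quick smoke test with hypothetical sizes.
head = BernoulliMixtureHead(hidden_dim=128, num_labels=10)
h = torch.randn(32, 128)                   # batch of encoder hidden states
y = torch.randint(0, 2, (32, 10)).float()  # observed multi-label vectors
loss = head.neg_log_likelihood(h, y)
```

Marginalizing over components with `logsumexp` keeps the mixture likelihood numerically stable; any λ-weighted penalty from injected domain knowledge would be added to this loss before backpropagation.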