Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Decomposable Transformer Point Processes
Authors: Aristeidis Panos
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We considered two different tasks to assess the predictive performance of our proposed method: Goodness-of-fit/next-event prediction and long-horizon prediction. We compared our method DTPP to several strong baselines over five real-world datasets and three synthetic ones. |
| Researcher Affiliation | Academia | Aristeidis Panos University of Cambridge EMAIL |
| Pseudocode | Yes | Algorithm 1 Long-Horizon Prediction for Decomposed Transformer Point Processes; Algorithm 2 Thinning Algorithm |
| Open Source Code | Yes | Our framework was implemented with PyTorch [31] and scikit-learn [32]; the code is available at https://github.com/aresPanos/dtpp. |
| Open Datasets | Yes | We fit the above six models on a diverse collection of five popular real-world datasets, each with varied characteristics: MIMIC-II [19], Amazon [28], Taxi [37], Taobao [43], and Stack Overflow V1 [20, 41]. |
| Dataset Splits | Yes | We use 200 epochs in total, a batch size of 8 sequences, and we apply early-stopping based on the log-likelihood of the held-out dev set. |
| Hardware Specification | No | Section A.2 'Training Details' mentions: 'All experiments were carried out on the same Linux machine with a dedicated reserved GPU used for acceleration.' This description is too general and lacks specific details such as the GPU model, CPU type, or memory, which are necessary for hardware reproducibility. |
| Software Dependencies | No | The paper mentions 'Our framework was implemented with PyTorch [31] and scikit-learn [32]' and also lists other repositories for baselines (e.g., 'https://github.com/yangalan123/anhp-andtt'). However, it does not provide specific version numbers for these software dependencies, which is required for reproducible software descriptions. |
| Experiment Setup | Yes | We use the Adam optimizer [18] with its default settings to train all the models in Section 5. We use 200 epochs in total, a batch size of 8 sequences, and we apply early-stopping based on the log-likelihood of the held-out dev set. ... The hyperparameters D and L were fine-tuned for each combination of dataset and model. We gridsearch the two parameters using the search spaces D ∈ {4, 8, 16, 32, 64, 128} and L ∈ {1, 2, 3, 4, 5}. |
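
The Experiment Setup row describes a grid search over D and L with early stopping on the dev-set log-likelihood. The following is a minimal sketch of that selection loop, not the paper's implementation: the function names, the `patience` value, and the `toy_run` stand-in for model training are all hypothetical, and the source does not say how many non-improving epochs trigger the stop.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple

# Search spaces and epoch budget as quoted in the Experiment Setup row.
D_GRID = (4, 8, 16, 32, 64, 128)
L_GRID = (1, 2, 3, 4, 5)
MAX_EPOCHS = 200


def best_dev_loglik(dev_logliks: Iterable[float], patience: int = 5) -> float:
    """Consume per-epoch dev log-likelihoods and return the best one seen.

    Stops after `patience` consecutive epochs without improvement.
    `patience` is an assumption; the source does not state a value.
    """
    best = float("-inf")
    stale = 0
    for epoch, loglik in enumerate(dev_logliks):
        if epoch >= MAX_EPOCHS:
            break
        if loglik > best:
            best, stale = loglik, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best


def grid_search(
    run: Callable[[int, int], Iterable[float]], patience: int = 5
) -> Tuple[int, int, float]:
    """Try every (D, L) pair and keep the one with the highest dev log-likelihood.

    `run(d, l)` is a hypothetical callback that trains one model and yields
    its dev log-likelihood after each epoch.
    """
    best: Optional[Tuple[int, int, float]] = None
    for d, l in product(D_GRID, L_GRID):
        score = best_dev_loglik(run(d, l), patience)
        if best is None or score > best[2]:
            best = (d, l, score)
    assert best is not None  # the grids are non-empty
    return best


def toy_run(d: int, l: int) -> Iterable[float]:
    """Synthetic stand-in for training: the curve peaks near D=32, L=3."""
    peak = -abs(d - 32) - abs(l - 3)
    for epoch in range(MAX_EPOCHS):
        yield peak - 10.0 / (epoch + 1)


if __name__ == "__main__":
    d, l, score = grid_search(toy_run)
    print(f"selected D={d}, L={l}, dev log-likelihood={score:.3f}")
```

In a real run, `toy_run` would be replaced by a routine that trains with Adam at its default settings and a batch size of 8 sequences, as the row states, and reports the held-out dev log-likelihood after each epoch.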