Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Latent Logic Tree Extraction for Event Sequence Explanation from LLMs
Authors: Zitao Song, Chao Yang, Chaojie Wang, Bo An, Shuang Li
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical demonstrations showcase the promising performance and adaptability of our framework. Empirical results show that this method notably enhances generalization in event histories with semantic information. |
| Researcher Affiliation | Collaboration | (1) School of Computer Science and Engineering, Nanyang Technological University, Singapore; (2) School of Data Science, The Chinese University of Hong Kong, Shenzhen, China; (3) Skywork AI, Singapore. |
| Pseudocode | Yes | Algorithm 1 Bayesian Logic Tree Learning for Events |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | MIMIC-III (Johnson et al., 2016), an electronic health record dataset from intensive care unit patients. ... EPIC-KITCHENS-100 (EPIC-100) (Damen et al., 2021), which documents everyday kitchen activities... Stack Overflow (SO) (Leskovec & Krevl, 2014), which records a sequence of reward history... |
| Dataset Splits | Yes | We consider each sequence as a record pertaining to a single individual and partition each dataset into 80%, 10%, 10% train/dev/test splits by the total population. |
| Hardware Specification | Yes | All the experiments were conducted on a server with 512G RAM, two 64 logical cores CPUS (AMD Ryzen Threadripper PRO 5995WX 64-Cores), and four NVIDIA RTX A6000 GPUs with 50G memory. |
| Software Dependencies | No | The paper states 'All models are implemented using the PyTorch framework.' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We present the selected hyperparameters on synthetic datasets and three real-world datasets in Table 6 and Table 7 respectively. These tables list specific values for 'EPOCHS', 'BATCH SIZE', 'LLM LR', 'LOGIC TREE DEPTH', 'LOGIC TREE WIDTH', and other training parameters. |
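
The Dataset Splits row describes an 80%/10%/10% train/dev/test partition made by population, with each sequence treated as one individual's record. Since no source code is released, the following is only a minimal sketch of such a split, not the authors' implementation. The function name `split_by_individual`, the dictionary layout, the fixed seed, and the use of shuffling are all assumptions made for illustration.

```python
import random


def split_by_individual(sequences, seed=0, train_pct=80, dev_pct=10):
    """Partition per-individual event sequences into train/dev/test dicts.

    `sequences` maps a (hypothetical) individual ID to that individual's
    event sequence. The split is made over individuals, so no individual
    appears in more than one partition.
    """
    ids = sorted(sequences)            # fixed base order for reproducibility
    random.Random(seed).shuffle(ids)   # seeded shuffle (an assumption)

    # Integer arithmetic avoids floating-point rounding in the split sizes.
    n_train = len(ids) * train_pct // 100
    n_dev = len(ids) * dev_pct // 100

    train_ids = ids[:n_train]
    dev_ids = ids[n_train:n_train + n_dev]
    test_ids = ids[n_train + n_dev:]   # remainder goes to the test split

    def pick(keys):
        return {k: sequences[k] for k in keys}

    return pick(train_ids), pick(dev_ids), pick(test_ids)


# Toy data: 100 individuals, each with a one-event sequence of (time, type).
toy = {f"person_{i}": [(float(i), "event")] for i in range(100)}
train, dev, test = split_by_individual(toy)
```

Splitting over individuals rather than over single events keeps every event of one person in the same partition, which is what "by the total population" suggests.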