Explaining Point Processes by Learning Interpretable Temporal Logic Rules

Authors: Shuang Li, Mingquan Feng, Lu Wang, Abdelmajid Essofi, Yufeng Cao, Junchi Yan, Le Song

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate TELLER on synthetic and real event data. For real data, we consider healthcare treatment understanding and crime pattern learning (crime results are in Appendix H & I).
Researcher Affiliation | Collaboration | 1. CUHK, Shenzhen; 2. Shanghai Jiao Tong University; 3. Microsoft Research; 4. MBZUAI; 5. BioMap
Pseudocode | Yes | Algorithm 1: TELLER (RAFS); Algorithm 2: Subproblem (SP). (A hedged sketch of this master/subproblem loop follows the table.)
Open Source Code | Yes | Code is available at https://github.com/FengMingquan-sjtu/Logic_Point_Processes_ICLR
Open Datasets | Yes | MIMIC-III is a dataset released under the PhysioNet Credentialed Health Data License 1.5.0 (https://physionet.org/content/mimiciii/view-license/1.4/). It was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). The requirement for individual patient consent was waived because all patient health information was deidentified. We manually checked that these data do not contain personally identifiable information or offensive content.
Dataset Splits | No | We extracted 4298 patient sequences, and randomly chose 80% of them for training and the remainder for testing. (A sketch of such a split follows the table.)
Hardware Specification | Yes | Our model is trained and evaluated using 16 processes in parallel, on a server with a Xeon W-3175X CPU.
Software Dependencies | No | The paper mentions using an "SGD type of algorithm" and "projected gradient descent" but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific library versions).
Experiment Setup | Yes | The learning rate in solving the restricted master problem is 10^-4. The master problem is optimized by an SGD-type algorithm, and we use projected gradient descent to handle the weight constraints. The batch size is 64. Each time we solve the subproblem, we randomly select 50% of the training data (i.e., patient sequences) to evaluate the subproblem objective. (A projected-gradient sketch with these settings follows the table.)
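
For the Pseudocode row: the paper's Algorithm 1 (TELLER) alternates between a restricted master problem (fitting rule weights) and a subproblem (searching for a new rule with positive gain, scored on a random 50% subsample). The snippet below is a minimal, self-contained sketch of that alternation on a toy least-squares surrogate; the real objective is the temporal point-process likelihood, and every name here (X_pool, solve_master, the gain threshold) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
X_pool = rng.integers(0, 2, size=(200, 30)).astype(float)   # candidate "rule" features
y = X_pool[:, [3, 7]] @ np.array([1.5, 0.8])                # target built from two rules

def solve_master(cols, lr=0.1, steps=2000):
    """Restricted master problem: fit nonnegative weights for the selected
    columns by projected gradient descent (the paper reports lr = 10^-4 on
    the real point-process likelihood; 0.1 suits this toy surrogate)."""
    A = X_pool[:, cols]
    w = np.zeros(len(cols))
    for _ in range(steps):
        grad = A.T @ (A @ w - y) / len(y)
        w = np.maximum(w - lr * grad, 0.0)    # gradient step, then projection
    return w

def subproblem(cols, w):
    """Subproblem: score every unused candidate rule on a random 50%
    subsample and return the one with the largest gain."""
    idx = rng.choice(len(y), size=len(y) // 2, replace=False)
    resid = y[idx] - X_pool[idx][:, cols] @ w
    gains = np.abs(X_pool[idx].T @ resid)
    gains[cols] = -np.inf                     # exclude already-selected rules
    best = int(np.argmax(gains))
    return best, gains[best]

cols, w = [], np.zeros(0)
for _ in range(10):
    best, gain = subproblem(cols, w)
    if gain < 1.0:                            # no candidate improves enough: stop
        break
    cols.append(best)
    w = solve_master(cols)
print("selected rules:", cols, "weights:", np.round(w, 2))
```

On this toy problem the loop recovers the two generating columns and then stops, mirroring the paper's stopping criterion of adding rules only while the subproblem finds a positive gain.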
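
For the Dataset Splits row: the paper reports a random 80/20 split of the 4298 patient sequences but states no seed. A minimal sketch, assuming an arbitrary seed (the seed value below is not from the paper):

```python
import numpy as np

n_sequences = 4298
rng = np.random.default_rng(seed=42)     # assumed seed; the paper states none
perm = rng.permutation(n_sequences)
n_train = int(0.8 * n_sequences)         # 3438 training sequences
train_idx, test_idx = perm[:n_train], perm[n_train:]
print(len(train_idx), len(test_idx))     # -> 3438 860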
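
For the Experiment Setup row: a hedged sketch of an SGD-style projected-gradient loop with the stated settings (learning rate 10^-4, batch size 64). The projection assumes the weight constraint is nonnegativity of the rule weights, which the excerpt does not spell out; projected_sgd, grad_fn, and the toy usage are hypothetical names for illustration.

```python
import numpy as np

def projected_sgd(grad_fn, n_examples, n_weights, lr=1e-4, batch_size=64,
                  epochs=10, seed=0):
    """Minibatch SGD with a projection after every step, as described for
    the restricted master problem (projection assumed to be w >= 0)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(n_weights)
    for _ in range(epochs):
        order = rng.permutation(n_examples)
        for start in range(0, n_examples, batch_size):
            batch = order[start:start + batch_size]
            w -= lr * grad_fn(w, batch)   # SGD step on a size-64 minibatch
            w = np.maximum(w, 0.0)        # projection onto the constraint set
    return w

# Toy usage with a least-squares gradient (illustrative only; with lr = 1e-4
# many epochs would be needed for full convergence).
X = np.random.default_rng(1).normal(size=(1024, 5))
y = X @ np.array([0.5, 1.0, 0.0, 2.0, 0.3])
grad = lambda w, idx: X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
w_hat = projected_sgd(grad, n_examples=1024, n_weights=5)
```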