A Novel Sequence-to-Subgraph Framework for Diagnosis Classification
Authors: Jun Chen, Quan Yuan, Chao Lu, Haifeng Huang
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The evaluation conducted on both the real-world English and Chinese datasets shows that the proposed method outperforms the state-of-the-art deep learning based diagnosis classification models. |
| Researcher Affiliation | Industry | Jun Chen , Quan Yuan , Chao Lu and Haifeng Huang Baidu Inc, Beijing 100193, China {chenjun22, yuanquan02, luchao, huanghaifeng}@baidu.com |
| Pseudocode | No | The paper describes the model architecture and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | No explicit statement or link providing access to the source code for the proposed method (SHiDAN) was found. A link is provided only for the LPA algorithm, a third-party tool used in the pipeline. |
| Open Datasets | Yes | MIMIC-III-50: a public English EMR dataset consisting of the Top-50 most frequent diagnosis codes [Mullenbach et al., 2018] (https://github.com/jamesmullenbach/caml-mimic). Each EMR has one or more diagnosis codes; thus, MIMIC-III-50 is used to evaluate the proposed method on multi-label classification. |
| Dataset Splits | No | The paper does not explicitly state the training, validation, and test splits used for the datasets (e.g., percentages or sample counts per split). While it uses datasets such as MIMIC-III-50 that have standard splits, it does not confirm that those standard splits were applied here. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using existing packages like CliNER and an LPA algorithm, but it does not specify version numbers for any of the software dependencies or libraries used. |
| Experiment Setup | Yes | By default, word embeddings and entity embeddings have 100 dimensions, and the latent feature m has 128 dimensions. The dropout rate is set empirically to 0.2. On MIMIC-III-50, each model is trained for 12 epochs with batch size 16 and a maximum of K = 15 subgraphs. On CHS-AD-200, each model is trained for 35 epochs with batch size 64 and a maximum of K = 6 subgraphs. |