Diaformer: Automatic Diagnosis via Symptoms Sequence Generation

Authors: Junying Chen, Dongfang Li, Qingcai Chen, Wenxiu Zhou, Xin Liu

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on three public datasets show that our model outperforms baselines on disease diagnosis by 1%, 6% and 11.5% with the highest training efficiency.
Researcher Affiliation | Academia | Harbin Institute of Technology (Shenzhen); Peng Cheng Laboratory
Pseudocode | No | The paper describes the methodology but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/jymChen/Diaformer.
Open Datasets | Yes | We evaluate our model on three public automatic diagnosis datasets, namely the MuZhi dataset (Wei et al. 2018), the Dxy dataset (Xu et al. 2019) and the Synthetic dataset (Liao et al. 2020).
Dataset Splits | No | For all model settings, the training set and test set both use the original format, as shown in Table 2. (Table 2 lists only # Training and # Test; no explicit validation split is reported.)
Hardware Specification | Yes | T_time indicates the training time needed to reach the best diagnosis result, running on a 1080Ti GPU.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | For all model settings, the training set and test set both use the original format, as shown in Table 2. All experiments are run 5 times and the final result is the average of the best results on the test set. Diaformer and its variants use small Transformer networks (L=5, H=512, A=6). For training, the learning rate is 5e-5 and the batch size is 16. For inference, ρe is set to 0.9 and ρp is set to 0.009 for the MuZhi dataset, 0.012 for the Dxy dataset and 0.01 for the Synthetic dataset.
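
The experiment setup quoted above amounts to a small set of hyperparameters. The sketch below collects them into a single Python configuration for reference; the class and field names (DiaformerConfig, rho_e, rho_p, RHO_P) are illustrative assumptions rather than names taken from the authors' repository, and only the numeric values come from the quoted setup.

```python
# Minimal sketch of the reported Diaformer experiment settings, assuming
# illustrative field names; values mirror the Experiment Setup row above.
from dataclasses import dataclass


@dataclass
class DiaformerConfig:
    # Small Transformer reported in the paper: L=5 layers, hidden size H=512,
    # A=6 attention heads.
    num_layers: int = 5
    hidden_size: int = 512
    num_attention_heads: int = 6

    # Training hyperparameters.
    learning_rate: float = 5e-5
    batch_size: int = 16

    # Inference thresholds: rho_e is 0.9 for all datasets; rho_p is
    # dataset-specific (see RHO_P below).
    rho_e: float = 0.9
    rho_p: float = 0.009


# Dataset-specific rho_p values reported for inference.
RHO_P = {"muzhi": 0.009, "dxy": 0.012, "synthetic": 0.01}

if __name__ == "__main__":
    cfg = DiaformerConfig(rho_p=RHO_P["dxy"])
    print(cfg)
```

This is only a convenience summary of the reported settings; the actual training and inference scripts live in the linked repository.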