Graph-Driven Generative Models for Heterogeneous Multi-Task Learning
Authors: Wenlin Wang, Hongteng Xu, Zhe Gan, Bai Li, Guoyin Wang, Liqun Chen, Qian Yang, Wenqi Wang, Lawrence Carin
AAAI 2020, pp. 979-988 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our method (GD-VAE) on the MIMIC-III dataset (Johnson et al. 2016)... Experimental results show that the jointly learned representation for the admission graph indeed improves the performance of all tasks relative to the individual task model. |
| Researcher Affiliation | Collaboration | Wenlin Wang¹, Hongteng Xu², Zhe Gan³, Bai Li¹, Guoyin Wang¹, Liqun Chen¹, Qian Yang¹, Wenqi Wang⁴, Lawrence Carin¹ — ¹Duke University, ²Infinia ML, Inc., ³Microsoft Dynamics 365 AI Research, ⁴Facebook, Inc. |
| Pseudocode | No | No pseudocode or algorithm blocks explicitly labeled as such were found. |
| Open Source Code | No | No statement or link regarding the provision of open-source code for the described methodology was found. |
| Open Datasets | Yes | We test our method (GD-VAE) on the MIMIC-III dataset (Johnson et al. 2016), which contains more than 58,000 hospital admissions with 14,567 disease ICD codes and 3,882 procedures ICD codes. |
| Dataset Splits | Yes | In each trial, we split the data into train, validation and test sets with a ratio of 0.6, 0.2 and 0.2, respectively. (A minimal split sketch follows the table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, or cloud computing instance specifications) are mentioned for running the experiments. |
| Software Dependencies | No | The paper mentions software components like 'PyTorch' in related work, but does not specify version numbers for any software dependencies used for replication. |
| Experiment Setup | Yes | For the network architecture, we fix the embedding space to be 200 for ICD codes and admissions, and a two-layer GCN (Kipf and Welling 2017) with residual connection is considered for the inference network. Concerning the dimension of the latent variables, z_T is identical to the number of topics for topic modeling, and z_R and z_P are 200 for the other two tasks. For the generative network, a linear layer is employed for both topic modeling and admission-type prediction. For the procedure recommendation, a one-hidden-layer MLP with tanh as the nonlinear activation function is used. Concerning the hyperparameters, we merge 10 randomly sampled admissions to generate a topic admission for our NBTM, such that y_T is not too sparse, and 5,000 samples are generated so as to train the model. Following (Srivastava and Sutton 2017), the prior α is a vector with constant value 0.02. (An architecture sketch follows the table.) |
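Since the paper releases no code, here is a minimal sketch of the 0.6/0.2/0.2 train/validation/test split quoted above. The function name `split_admissions`, the `admission_ids` argument, and the fixed `seed` are illustrative assumptions, not the authors' actual splitting procedure.

```python
import numpy as np

def split_admissions(admission_ids, seed=0):
    """Shuffle admission IDs and split them 0.6/0.2/0.2 into train/val/test."""
    rng = np.random.default_rng(seed)  # hypothetical seed; paper reports multiple trials
    ids = np.array(admission_ids)
    rng.shuffle(ids)
    n_train = int(0.6 * len(ids))
    n_val = int(0.2 * len(ids))
    return (ids[:n_train],                  # 60% train
            ids[n_train:n_train + n_val],   # 20% validation
            ids[n_train + n_val:])          # 20% test
```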
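The Experiment Setup row pins down the architecture well enough for a rough PyTorch sketch: a two-layer GCN with residual connections over 200-d embeddings as the inference network, a linear decoder for topic modeling and for admission-type prediction, and a one-hidden-layer tanh MLP for procedure recommendation. Everything else is an assumption: the class and argument names are hypothetical, the VAE machinery (reparameterization, the Dirichlet-style prior with α = 0.02, KL terms) is omitted in favor of deterministic encodings, and the `num_topics` and `num_admission_types` defaults are placeholders. The 14,567 disease and 3,882 procedure ICD-code counts come from the Open Datasets row.

```python
import torch
import torch.nn as nn

class ResGCNLayer(nn.Module):
    """One GCN layer (Kipf and Welling 2017) with a residual connection."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, adj_norm, x):
        # adj_norm: normalized admission-graph adjacency, shape (N, N)
        # x:        node embeddings, shape (N, dim)
        return torch.relu(adj_norm @ self.linear(x)) + x

class GDVAESketch(nn.Module):
    def __init__(self, embed_dim=200, num_topics=50, num_admission_types=4,
                 num_disease_codes=14567, num_procedures=3882):
        super().__init__()
        # Inference network: two-layer GCN over 200-d embeddings.
        self.gcn = nn.ModuleList([ResGCNLayer(embed_dim) for _ in range(2)])
        # Latent projections: z_T matches the topic count; z_R, z_P are 200-d.
        self.to_z_T = nn.Linear(embed_dim, num_topics)
        self.to_z_R = nn.Linear(embed_dim, embed_dim)
        self.to_z_P = nn.Linear(embed_dim, embed_dim)
        # Generative networks: linear decoders for topic modeling and
        # admission-type prediction; one-hidden-layer tanh MLP for
        # procedure recommendation.
        self.topic_decoder = nn.Linear(num_topics, num_disease_codes)
        self.type_head = nn.Linear(embed_dim, num_admission_types)
        self.proc_head = nn.Sequential(
            nn.Linear(embed_dim, embed_dim),
            nn.Tanh(),
            nn.Linear(embed_dim, num_procedures),
        )

    def forward(self, adj_norm, x):
        for layer in self.gcn:
            x = layer(adj_norm, x)
        z_T = torch.softmax(self.to_z_T(x), dim=-1)  # topic proportions
        return (self.topic_decoder(z_T),         # disease-code reconstruction
                self.type_head(self.to_z_R(x)),  # admission-type logits
                self.proc_head(self.to_z_P(x)))  # procedure logits
```

Keeping the GCN width equal to the embedding size (200 in, 200 out) is what makes the additive residual connection dimension-compatible without extra projections; the per-task heads then branch off the shared node representation, which is the joint-representation benefit the paper reports.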