Directed Acyclic Graph Neural Networks
Authors: Veronika Thost, Jie Chen
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform comprehensive experiments, including ablation studies, on representative DAG datasets (i.e., source code, neural architectures, and probabilistic graphical models) and demonstrate the superiority of DAGNN over simpler DAG architectures as well as general graph architectures. |
| Researcher Affiliation | Collaboration | Veronika Thost & Jie Chen, MIT-IBM Watson AI Lab, IBM Research; Veronika.Thost@ibm.com, chenjie@us.ibm.com |
| Pseudocode | No | The paper describes procedures and model equations but does not contain clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Supported code is available at https://github.com/vthost/DAGNN. |
| Open Datasets | Yes | The OGBG-CODE dataset (Hu et al., 2020) contains 452,741 Python functions parsed into DAGs. [...] The NA dataset (Zhang et al., 2019) contains 19,020 neural architectures [...] The BN dataset (Zhang et al., 2019) contains 200,000 Bayesian networks... |
| Dataset Splits | Yes | We adopt OGB's project split, whose training set consists of Github projects not seen in the validation and test sets. [...] For NA and BN, we adopted the given 90/10 splits. [...] We used 5-fold cross validation due to the size of the dataset and the number of baselines for comparison. (A minimal split-loading sketch follows the table.) |
| Hardware Specification | No | Most experiments were conducted on the Satori cluster (satori.mit.edu). This statement provides a general computing environment but lacks specific hardware details such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | All models were implemented in PyTorch (Paszke et al., 2019). [...] implemented using PyTorch Geometric (Fey & Lenssen, 2019). [...] generated by using the R package bnlearn (Scutari, 2010). Specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | For DAGNN, we used hidden dimension 300. [...] We stopped training when the validation metric did not improve further under a patience of 20 epochs, for all models but D-VAE and DAGNN. For the latter two, we used a patience of 10. Moreover, for these two models we used gradient clipping (at 0.25) due to the recurrent layers and a batch size of 80. [...] For DAGNN, we started the learning rate scheduler at 1e-3 (instead of 1e-4) and stopped at a maximum number of epochs, 100 for NA and 50 for BN (instead of 300 and 100, respectively). (A hedged sketch of this training configuration follows the table.) |
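
The dataset and split rows above refer to OGBG-CODE, which is distributed through the OGB package together with its project split. The sketch below is not taken from the DAGNN repository; it only assumes the standard OGB graph-property-prediction API (`PygGraphPropPredDataset`, `get_idx_split`, `Evaluator`) and PyTorch Geometric's `DataLoader`. The dataset name is `ogbg-code` in older OGB releases and `ogbg-code2` in newer ones, and the batch size of 80 is taken from the experiment-setup row.

```python
# Minimal sketch (not the authors' code): load OGBG-CODE and OGB's project split.
from ogb.graphproppred import PygGraphPropPredDataset, Evaluator
from torch_geometric.loader import DataLoader  # torch_geometric.data.DataLoader in older PyG

# "ogbg-code" was renamed "ogbg-code2" in later OGB releases; adjust to your installed version.
dataset = PygGraphPropPredDataset(name="ogbg-code")
split_idx = dataset.get_idx_split()  # project split: {"train", "valid", "test"} index tensors

# Batch size 80 follows the experiment-setup row; only the training set is shuffled.
train_loader = DataLoader(dataset[split_idx["train"]], batch_size=80, shuffle=True)
valid_loader = DataLoader(dataset[split_idx["valid"]], batch_size=80, shuffle=False)
test_loader  = DataLoader(dataset[split_idx["test"]],  batch_size=80, shuffle=False)

evaluator = Evaluator(name="ogbg-code")  # computes the dataset's official evaluation metric
```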
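
The experiment-setup row combines several training details: patience-based early stopping (10 epochs for DAGNN and D-VAE, 20 otherwise), gradient clipping at 0.25, batch size 80, a learning-rate schedule started at 1e-3, and per-dataset epoch caps. The sketch below illustrates such a loop under stated assumptions: `model`, `train_loader`, `compute_loss`, and `evaluate` are hypothetical placeholders supplied by the caller, and the optimizer (Adam) and scheduler (ReduceLROnPlateau) are assumptions since the excerpt does not name them.

```python
# Hedged sketch of the reported training configuration; not the authors' implementation.
import torch


def train_with_early_stopping(model, train_loader, compute_loss, evaluate,
                              max_epochs=100, patience=10, clip_norm=0.25, lr=1e-3):
    """Placeholders: compute_loss(model, batch) -> scalar loss; evaluate(model) -> val metric."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)   # optimizer type assumed
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(   # scheduler type assumed
        optimizer, mode="max", patience=5)

    best_val, epochs_without_improvement = float("-inf"), 0
    for epoch in range(max_epochs):                           # 100 for NA, 50 for BN
        model.train()
        for batch in train_loader:                            # batch size 80 per the excerpt
            optimizer.zero_grad()
            loss = compute_loss(model, batch)
            loss.backward()
            # Gradient clipping at 0.25 due to the recurrent layers.
            torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
            optimizer.step()

        val_metric = evaluate(model)
        scheduler.step(val_metric)
        # Stop once the validation metric has not improved for `patience` epochs.
        if val_metric > best_val:
            best_val, epochs_without_improvement = val_metric, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_val
```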