Deep Graph Infomax
Authors: Petar Veličković, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, R Devon Hjelm
ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate competitive performance on a variety of node classification benchmarks, which at times even exceeds the performance of supervised learning. |
| Researcher Affiliation | Collaboration | Petar Veličković, Department of Computer Science and Technology, University of Cambridge, petar.velickovic@cst.cam.ac.uk; William Fedus, Mila (Québec Artificial Intelligence Institute) and Google Brain, liamfedus@google.com; William L. Hamilton, Mila (Québec Artificial Intelligence Institute) and McGill University, wlh@cs.mcgill.ca; Pietro Liò, Department of Computer Science and Technology, University of Cambridge, pietro.lio@cst.cam.ac.uk; Yoshua Bengio, Mila (Québec Artificial Intelligence Institute) and Université de Montréal, yoshua.bengio@mila.quebec; R Devon Hjelm, Microsoft Research and Mila (Québec Artificial Intelligence Institute), devon.hjelm@microsoft.com |
| Pseudocode | No | The paper lists the steps of the procedure in Section 3.4 as a numbered list, but it is not presented in a formal pseudocode or algorithm block format with keywords like 'function', 'for loop', etc. (A hedged code sketch of the listed steps appears below the table.) |
| Open Source Code | Yes | A reference DGI implementation may be found at https://github.com/PetarV-/DGI. |
| Open Datasets | Yes | We utilize three standard citation network benchmark datasets Cora, Citeseer and Pubmed (Sen et al., 2008) and closely follow the transductive experimental setup of Yang et al. (2016). ... We make use of a protein-protein interaction (PPI) dataset that consists of graphs corresponding to different human tissues (Zitnik & Leskovec, 2017). ... Table 1: Summary of the datasets used in our experiments (e.g. Cora: Transductive, 2,708 nodes, 5,429 edges, 1,433 features, 7 classes, 140/500/1,000 train/val/test nodes). |
| Dataset Splits | Yes | Table 1: Summary of the datasets used in our experiments. Columns: Dataset, Task, Nodes, Edges, Features, Classes, Train/Val/Test Nodes. Cora: Transductive, 2,708, 5,429, 1,433, 7, 140/500/1,000. Citeseer: Transductive, 3,327, 4,732, 3,703, 6, 120/500/1,000. Pubmed: Transductive, 19,717, 44,338, 500, 3, 60/500/1,000. Reddit: Inductive, 231,443, 11,606,919, 602, 41, 151,708/23,699/55,334. PPI: Inductive, 56,944 (24 graphs), 818,716, 50, 121 (multilabel), 44,906/6,514/5,524 (20/2/2 graphs). |
| Hardware Specification | No | The paper mentions that a dataset 'will not fit into GPU memory entirely' but does not specify any particular GPU model or other hardware used for experiments. |
| Software Dependencies | No | The paper thanks the developers of PyTorch (Paszke et al., 2017) but does not provide specific version numbers for PyTorch or any other software libraries or dependencies used. |
| Experiment Setup | Yes | For the transductive learning tasks (Cora, Citeseer and Pubmed), our encoder is a one-layer Graph Convolutional Network (GCN) model (Kipf & Welling, 2016a), with the following propagation rule: E(X, A) = σ(D̂^{-1/2} Â D̂^{-1/2} X Θ) ... For the nonlinearity, σ, we have applied the parametric ReLU (PReLU) function (He et al., 2015), and Θ ∈ ℝ^{F×F′} is a learnable linear transformation applied to every node, with F′ = 512 features being computed (specially, F′ = 256 on Pubmed due to memory limitations). ... All models are initialized using Glorot initialization (Glorot & Bengio, 2010) and trained to maximize the mutual information provided in Equation 1 on the available nodes (all nodes for the transductive, and training nodes only in the inductive setup) using the Adam SGD optimizer (Kingma & Ba, 2014) with an initial learning rate of 0.001 (specially, 10^{-5} on Reddit). On the transductive datasets, we use an early stopping strategy on the observed training loss, with a patience of 20 epochs. (A hedged code sketch of this encoder and optimizer configuration appears below the table.) |
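
The "Pseudocode" row above notes that the DGI procedure appears only as a numbered list in Section 3.4. As a rough illustration of those steps (corruption, encoding the real and corrupted graphs, readout, discrimination, and the binary cross-entropy objective of Equation 1), here is a minimal PyTorch sketch. The class, parameter, and variable names are ours, not the authors' reference implementation; the corruption (row-wise feature shuffling), mean readout, and bilinear discriminator follow the choices the paper describes for the transductive setting.

```python
import torch
import torch.nn as nn


class DGI(nn.Module):
    """Sketch of one DGI training step, following the numbered procedure in
    Section 3.4. `encoder` is any module mapping (features, adjacency) to
    per-node embeddings; names are illustrative, not the authors' code."""

    def __init__(self, encoder, n_hidden):
        super().__init__()
        self.encoder = encoder
        # Bilinear discriminator D(h_i, s); scores are kept as logits here.
        self.weight = nn.Parameter(torch.empty(n_hidden, n_hidden))
        nn.init.xavier_uniform_(self.weight)  # Glorot initialization

    def forward(self, x, adj):
        # 1. Sample a negative example via the corruption function C:
        #    row-wise shuffling of the feature matrix, keeping the adjacency.
        x_corrupt = x[torch.randperm(x.size(0))]
        # 2./3. Patch representations for the real and corrupted graphs.
        h_pos = self.encoder(x, adj)
        h_neg = self.encoder(x_corrupt, adj)
        # 4. Summary vector s via the readout R: sigmoid of the mean embedding.
        s = torch.sigmoid(h_pos.mean(dim=0))
        # 5. Discriminator scores and the BCE-style objective of Equation 1.
        logits_pos = h_pos @ self.weight @ s
        logits_neg = h_neg @ self.weight @ s
        logits = torch.cat([logits_pos, logits_neg])
        labels = torch.cat([torch.ones_like(logits_pos),
                            torch.zeros_like(logits_neg)])
        return nn.functional.binary_cross_entropy_with_logits(logits, labels)
```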
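
Similarly, the "Experiment Setup" row quotes a one-layer GCN encoder with a PReLU nonlinearity, Glorot initialization, and Adam with an initial learning rate of 0.001. A hedged sketch of that configuration, again with illustrative names and assuming a pre-normalized adjacency matrix is supplied, might look like this:

```python
import torch
import torch.nn as nn


class GCNEncoder(nn.Module):
    """One-layer GCN encoder with a PReLU nonlinearity, matching the quoted
    transductive setup (512 output features; 256 on Pubmed). `adj_norm` is
    assumed to be the symmetrically normalized adjacency with self-loops,
    i.e. D^{-1/2} (A + I) D^{-1/2}."""

    def __init__(self, in_features, out_features=512):
        super().__init__()
        self.theta = nn.Linear(in_features, out_features, bias=False)
        nn.init.xavier_uniform_(self.theta.weight)  # Glorot initialization
        self.act = nn.PReLU()  # parametric ReLU

    def forward(self, x, adj_norm):
        # sigma(D^{-1/2} A_hat D^{-1/2} X Theta)
        return self.act(adj_norm @ self.theta(x))


# Adam with the quoted initial learning rate of 0.001 (10^-5 on Reddit);
# early stopping on the training loss with a patience of 20 epochs would
# wrap the training loop. In full DGI training, the discriminator
# parameters from the previous sketch would be optimized as well.
encoder = GCNEncoder(in_features=1433)  # e.g. Cora's feature dimension
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
```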