Deep Graph Infomax

Authors: Petar Veličković, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, R Devon Hjelm

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate competitive performance on a variety of node classification benchmarks, which at times even exceeds the performance of supervised learning.
Researcher Affiliation | Collaboration | Petar Veličković, Department of Computer Science and Technology, University of Cambridge, petar.velickovic@cst.cam.ac.uk; William Fedus, Mila (Québec Artificial Intelligence Institute) and Google Brain, liamfedus@google.com; William L. Hamilton, Mila (Québec Artificial Intelligence Institute) and McGill University, wlh@cs.mcgill.ca; Pietro Liò, Department of Computer Science and Technology, University of Cambridge, pietro.lio@cst.cam.ac.uk; Yoshua Bengio, Mila (Québec Artificial Intelligence Institute) and Université de Montréal, yoshua.bengio@mila.quebec; R Devon Hjelm, Microsoft Research and Mila (Québec Artificial Intelligence Institute), devon.hjelm@microsoft.com
Pseudocode | No | The paper lists the steps of the procedure in Section 3.4 as a numbered list, but does not present them in a formal pseudocode or algorithm block with keywords such as 'function' or 'for'.
Open Source Code | Yes | A reference DGI implementation may be found at https://github.com/PetarV-/DGI.
Open Datasets | Yes | We utilize three standard citation network benchmark datasets Cora, Citeseer and Pubmed (Sen et al., 2008) and closely follow the transductive experimental setup of Yang et al. (2016). ... We make use of a protein-protein interaction (PPI) dataset that consists of graphs corresponding to different human tissues (Zitnik & Leskovec, 2017). ... Table 1: Summary of the datasets used in our experiments. (e.g. Cora: transductive; 2,708 nodes; 5,429 edges; 1,433 features; 7 classes; 140/500/1,000 train/val/test nodes.)
Dataset Splits | Yes | Table 1: Summary of the datasets used in our experiments. Cora: transductive; 2,708 nodes; 5,429 edges; 1,433 features; 7 classes; 140/500/1,000 train/val/test nodes. Citeseer: transductive; 3,327 nodes; 4,732 edges; 3,703 features; 6 classes; 120/500/1,000 train/val/test nodes. Pubmed: transductive; 19,717 nodes; 44,338 edges; 500 features; 3 classes; 60/500/1,000 train/val/test nodes. Reddit: inductive; 231,443 nodes; 11,606,919 edges; 602 features; 41 classes; 151,708/23,699/55,334 train/val/test nodes. PPI: inductive (24 graphs); 56,944 nodes; 818,716 edges; 50 features; 121 classes (multi-label); 44,906/6,514/5,524 train/val/test nodes (20/2/2 graphs).
Hardware Specification | No | The paper mentions that a dataset 'will not fit into GPU memory entirely' but does not specify any particular GPU model or other hardware used for experiments.
Software Dependencies | No | The paper thanks the developers of PyTorch (Paszke et al., 2017) but does not provide specific version numbers for PyTorch or any other software libraries or dependencies used.
Experiment Setup | Yes | For the transductive learning tasks (Cora, Citeseer and Pubmed), our encoder is a one-layer Graph Convolutional Network (GCN) model (Kipf & Welling, 2016a), with the following propagation rule: E(X, A) = σ(D̂^(-1/2) Â D̂^(-1/2) XΘ), where Â = A + I_N and D̂ is its degree matrix. ... For the nonlinearity, σ, we have applied the parametric ReLU (PReLU) function (He et al., 2015), and Θ ∈ R^(F×F′) is a learnable linear transformation applied to every node, with F′ = 512 features being computed (specifically, F′ = 256 on Pubmed due to memory limitations). ... All models are initialized using Glorot initialization (Glorot & Bengio, 2010) and trained to maximize the mutual information provided in Equation 1 on the available nodes (all nodes in the transductive setup, and training nodes only in the inductive setup) using the Adam SGD optimizer (Kingma & Ba, 2014) with an initial learning rate of 0.001 (specifically, 10^-5 on Reddit). On the transductive datasets, we use an early stopping strategy on the observed training loss, with a patience of 20 epochs.
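For context, the setup quoted in the rows above (a one-layer GCN encoder with symmetric normalization and a PReLU nonlinearity, scored against a corrupted graph by a bilinear discriminator on a mean-pooled summary vector) can be sketched in plain NumPy. This is a minimal reconstruction from the paper's description, not the authors' code; the names `Theta` and `W`, the PReLU slope, and the row-shuffling corruption function are illustrative assumptions.

```python
import numpy as np

def gcn_encoder(X, A, Theta, prelu_alpha=0.25):
    """One-layer GCN: sigma(D^-1/2 (A+I) D^-1/2 X Theta), sigma = PReLU.
    Theta and prelu_alpha would be learned; here they are fixed inputs."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # degrees >= 1, safe
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    H = A_norm @ X @ Theta
    return np.where(H > 0.0, H, prelu_alpha * H)    # PReLU nonlinearity

def readout(H):
    """Graph-level summary s: logistic sigmoid of the mean node embedding."""
    return 1.0 / (1.0 + np.exp(-H.mean(axis=0)))

def discriminator(h, s, W):
    """Bilinear patch-summary score sigma(h^T W s) in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(h @ W @ s)))

def dgi_loss(H_pos, H_neg, W):
    """Binary cross-entropy between scores of real and corrupted patches,
    the noise-contrastive form of the DGI mutual-information objective."""
    s = readout(H_pos)
    pos = np.array([discriminator(h, s, W) for h in H_pos])
    neg = np.array([discriminator(h, s, W) for h in H_neg])
    eps = 1e-12  # numerical guard for log
    return -(np.log(pos + eps).mean() + np.log(1.0 - neg + eps).mean())
```

In a full training loop, the negative embeddings `H_neg` would come from encoding a corrupted graph (the paper shuffles node features row-wise while keeping the adjacency fixed), and `Theta` and `W` would be optimized with Adam at the quoted learning rate of 0.001.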