Capturing Graphs with Hypo-Elliptic Diffusions

Authors: Csaba Toth, Darrick Lee, Celia Hacker, Harald Oberhauser

NeurIPS 2022

Reproducibility variables, each with the assessed result and the supporting quotes identified by the LLM:

Research Type: Experimental
"Besides the attractive theoretical properties, our experiments show that this method competes with graph transformers on datasets requiring long-range reasoning but scales only linearly in the number of edges as opposed to quadratically in nodes." and "Finally, Section 5 provides experiments and benchmarks."

Researcher Affiliation: Academia
Csaba Toth, Mathematical Institute, University of Oxford (toth@maths.ox.ac.uk); Darrick Lee, Mathematical Institute, University of Oxford (leed@maths.ox.ac.uk); Celia Hacker, Department of Mathematics, EPFL (celia.hacker@epfl.ch); Harald Oberhauser, Mathematical Institute, University of Oxford (oberhauser@maths.ox.ac.uk)

Pseudocode: Yes
"Theorem 3. Let [...] be as in (14) and define f_{k,m} ∈ R^n for m = 1, ..., M as... Overall, Eq. (15) computes f_{k,m}(i) for all i ∈ V, k = 1, ..., K, m = 1, ..., M in O(K M^2 N_E + M N_E d) operations, where N_E ∈ ℕ denotes the number of edges; see App. F. In particular, one does not need to compute Φ(i) ∈ H directly or store large tensors." and "we also provide a pseudocode implementation [in Appendix F]."

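The quoted complexity bound is the practical payoff of Theorem 3: features are built from repeated sparse multiplications over the edge list rather than by materializing Φ(i) ∈ H. Below is a minimal sketch of that recursion pattern, assuming a hypothetical function name `low_rank_features` and random rank-M projections in place of the paper's specific functionals; it illustrates the linear-in-edges cost, not the authors' Appendix F pseudocode.

```python
import torch

def low_rank_features(edge_index, node_feats, K=3, M=8, seed=0):
    """Illustrative sketch: propagate rank-M functionals of node features
    through K sparse adjacency multiplications, so each level costs
    O(M * N_E) rather than anything quadratic in the number of nodes."""
    n, d = node_feats.shape
    torch.manual_seed(seed)
    # Random rank-M projections standing in for the paper's functionals.
    proj = torch.randn(K, d, M)
    # Sparse adjacency: one nonzero per edge, so A @ h costs O(M * N_E).
    values = torch.ones(edge_index.shape[1])
    A = torch.sparse_coo_tensor(edge_index, values, (n, n))
    h = node_feats @ proj[0]                                 # (n, M)
    feats = [h]
    for k in range(1, K):
        h = torch.sparse.mm(A, h) * (node_feats @ proj[k])   # O(M * N_E) step
        feats.append(h)
    return torch.cat(feats, dim=1)                           # (n, K * M)

# Tiny usage: a single undirected edge between two nodes.
x = low_rank_features(torch.tensor([[0, 1], [1, 0]]), torch.randn(2, 4))
```
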
Open Source Code: Yes
"The code is implemented in PyTorch and is publicly available."

Open Datasets: Yes
"Datasets. We use two biological graph classification datasets (NCI1 and NCI109), that contain around 4000 biochemical compounds represented as graphs with 30 nodes on average [67, 1]."

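Since the code is built on Spektral, one plausible way to load these benchmarks is Spektral's TU dataset loader; a minimal sketch (the loader choice is our assumption, not a detail stated in the paper):

```python
# Hedged sketch: fetching the TU graph benchmarks via Spektral's loader.
from spektral.datasets import TUDataset

nci1 = TUDataset("NCI1")      # molecular graphs, ~30 nodes on average
nci109 = TUDataset("NCI109")
print(len(nci1), len(nci109))
```
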
Dataset Splits: Yes
"The dataset is split in a ratio of 80% / 10% / 10% for training, validation and testing." and "Data splits (80/10/10) for NCI1 and NCI109 are taken from [70]."

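For illustration, a seeded 80/10/10 split looks like the sketch below; the paper's actual NCI1/NCI109 splits are taken from [70], so this only mirrors the ratios. The helper name `split_indices` is hypothetical.

```python
import numpy as np

def split_indices(n, seed=0):
    """Shuffle indices with a fixed seed and cut them 80/10/10."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# NCI1 contains 4110 graphs; NCI109 contains 4127.
train_idx, val_idx, test_idx = split_indices(4110, seed=0)
```
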
Hardware Specification: Yes
"All experiments were performed on a single Nvidia A100 GPU with 40GB of memory."

Software Dependencies: Yes
"Our code is built upon the Spektral library [25] and PyTorch [56]. We recommend Python version 3.8.10 or higher. The models were tested with PyTorch version 1.10.2."

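A quick way to check an environment against the reported versions, using only standard-library introspection (a sketch, not the authors' tooling):

```python
import sys
from importlib import metadata

# The authors recommend Python >= 3.8.10 and tested with PyTorch 1.10.2.
assert sys.version_info >= (3, 8, 10), "Python >= 3.8.10 recommended"
print("torch:", metadata.version("torch"))
print("spektral:", metadata.version("spektral"))
```
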
Experiment Setup: Yes
"Training is performed by minimizing the categorical cross-entropy loss with an ℓ2 regularization penalty of 10^-4. For optimization, Adam [32] is used with a batch size of 128 and an initial learning rate of 10^-3 that is decayed via a cosine annealing schedule [42] over 200 epochs." and "All models were trained for 200 epochs, with a cosine annealing learning rate schedule [42] using an initial learning rate of 10^-3 and Adam optimizer [32] with ℓ2 regularization (weight decay) of 10^-4 and batch size 128. Random seeds were set for all experiments to 0, 1, ..., 9."

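The reported optimization setup maps directly onto standard PyTorch components; a hedged sketch follows, with a placeholder model and one dummy batch standing in for the real graph data:

```python
import torch

# Placeholder model and a single dummy batch of size 128; only the optimizer,
# scheduler, and loss mirror the reported setup.
model = torch.nn.Linear(16, 2)
loader = [(torch.randn(128, 16), torch.randint(0, 2, (128,)))]

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)
loss_fn = torch.nn.CrossEntropyLoss()  # categorical cross-entropy

torch.manual_seed(0)  # the paper repeats runs over seeds 0, 1, ..., 9
for epoch in range(200):
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    scheduler.step()  # cosine-annealed learning rate, stepped once per epoch
```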