A Generalization of ViT/MLP-Mixer to Graphs

Authors: Xiaoxin He, Bryan Hooi, Thomas Laurent, Adam Perold, Yann LeCun, Xavier Bresson

ICML 2023

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | We test our architecture on 4 simulated datasets and 7 real-world benchmarks, and show highly competitive results on all of them.
Researcher Affiliation | Collaboration | 1) School of Computing, National University of Singapore; 2) Institute of Data Science, National University of Singapore; 3) Loyola Marymount University; 4) Element, Inc.; 5) New York University; 6) Meta AI
Pseudocode | No | The paper describes the proposed architecture and processes in narrative form, detailing steps such as 'Raw node and edge linear embedding' and 'Graph convolutional layers with MP-GNN', but it does not include any formally structured or labeled pseudocode or algorithm blocks (a hedged sketch of the described pipeline appears after the table).
Open Source Code | Yes | The source code is available for reproducibility at: https://github.com/XiaoxinHe/Graph-ViT-MLPMixer
Open Datasets | Yes | We evaluate our Graph ViT/MLP-Mixer on a wide range of graph benchmarks: 1) Simulated datasets: CSL, EXP, SR25 and the Tree Neighbour Match dataset; 2) Small real-world datasets: ZINC, MNIST and CIFAR10 from Benchmarking GNNs (Dwivedi et al., 2020), and MolTOX21 and MolHIV from OGB (Hu et al., 2020); and 3) Large real-world datasets: Peptides-func and Peptides-struct from LRGB (Dwivedi et al., 2022).
Dataset Splits | Yes | The dataset comes with a predefined 10K/1K/1K train/validation/test split. [...] for MNIST 55K/5K/10K and for CIFAR10 45K/5K/10K train/validation/test graphs. [...] for MolTOX21 6K/0.78K/0.78K and for MolHIV 32K/4K/4K train/validation/test. All evaluated real-world benchmarks define a standard train/validation/test split (see the split-loading example after the table).
Hardware Specification | Yes | We ran our experiments on NVIDIA RTX A5000 GPUs.
Software Dependencies | No | We implement our model using PyTorch (Paszke et al., 2019) and PyG (Fey & Lenssen, 2019). The paper mentions software packages like PyTorch and PyG along with their originating papers, but does not specify the exact version numbers used for the experiments.
Experiment Setup | Yes | For optimization, we use the Adam optimizer (Kingma & Ba, 2014) with the default settings of β1 = 0.9, β2 = 0.999, and ϵ = 1e-8. We use the same hyperparameters with a batch size of 32 and a learning rate of 0.01 without further tuning. [...] The hidden size is set to 128 and the number of layers is set to 4 by default. (See the optimizer snippet after the table.)
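
Since the paper reports no formal pseudocode, the following is a minimal, hypothetical sketch of the pipeline it describes in narrative form: linear embedding of raw node features, patch-local message passing, MLP-Mixer token/channel mixing over patch tokens, and a graph-level readout. It is not the authors' implementation; the toy mean-aggregation message-passing step, the dense adjacency and patch-membership inputs, the omission of edge features, and the fixed per-graph patch count are all simplifying assumptions. The released code at https://github.com/XiaoxinHe/Graph-ViT-MLPMixer is authoritative.

```python
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    """Standard MLP-Mixer block: token mixing across patches, then channel mixing."""
    def __init__(self, num_patches, dim, hidden=256):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, hidden), nn.GELU(), nn.Linear(hidden, num_patches))
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):                              # x: (num_patches, dim)
        x = x + self.token_mlp(self.norm1(x).T).T      # mix across patch tokens
        x = x + self.channel_mlp(self.norm2(x))        # mix across channels
        return x

class GraphMLPMixerSketch(nn.Module):
    """Hypothetical sketch: embed nodes, one MP step, pool to patch tokens, mix, read out."""
    def __init__(self, node_feat, dim=128, num_layers=4, num_patches=32):
        super().__init__()
        self.embed = nn.Linear(node_feat, dim)         # raw node linear embedding
        self.mp = nn.Linear(2 * dim, dim)              # toy message passing: self + neighbor mean
        self.mixer = nn.Sequential(*[MixerBlock(num_patches, dim) for _ in range(num_layers)])
        self.head = nn.Linear(dim, 1)

    def forward(self, x, adj, patch_assign):
        # x: (N, node_feat); adj: (N, N) dense adjacency; patch_assign: (P, N) 0/1 membership
        h = self.embed(x)
        neigh = adj @ h / adj.sum(-1, keepdim=True).clamp(min=1)        # neighbor mean
        h = torch.relu(self.mp(torch.cat([h, neigh], dim=-1)))
        tokens = patch_assign @ h / patch_assign.sum(-1, keepdim=True).clamp(min=1)  # patch tokens
        return self.head(self.mixer(tokens).mean(0))   # graph-level readout

# Smoke test on a random graph with 50 nodes and 8 patches.
N, P, F = 50, 8, 16
x = torch.randn(N, F)
adj = (torch.rand(N, N) < 0.1).float()
patch = torch.zeros(P, N)
patch[torch.randint(P, (N,)), torch.arange(N)] = 1.0   # random node-to-patch assignment
model = GraphMLPMixerSketch(node_feat=F, num_patches=P)
print(model(x, adj, patch).shape)                      # torch.Size([1])
```

Note that the hidden size of 128 and the 4 layers match the defaults stated in the Experiment Setup row.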
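The predefined splits quoted in the Dataset Splits row are exposed directly by the dataset loaders. A short check, assuming current torch_geometric and ogb APIs (the data/ paths are placeholders):

```python
from torch_geometric.datasets import ZINC, GNNBenchmarkDataset
from ogb.graphproppred import PygGraphPropPredDataset

# ZINC-subset ships with the predefined 10K/1K/1K split.
train = ZINC(root='data/ZINC', subset=True, split='train')   # 10,000 graphs
val = ZINC(root='data/ZINC', subset=True, split='val')       # 1,000 graphs
test = ZINC(root='data/ZINC', subset=True, split='test')     # 1,000 graphs

# MNIST/CIFAR10 superpixel graphs from Benchmarking GNNs use the same mechanism.
mnist_train = GNNBenchmarkDataset(root='data', name='MNIST', split='train')

# OGB datasets expose their standard splits via get_idx_split().
hiv = PygGraphPropPredDataset(name='ogbg-molhiv', root='data')
idx = hiv.get_idx_split()            # dict with 'train', 'valid', 'test' index tensors
print(len(hiv[idx['train']]), len(hiv[idx['valid']]), len(hiv[idx['test']]))
```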
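Finally, the reported optimizer settings (β1 = 0.9, β2 = 0.999, ϵ = 1e-8) coincide with PyTorch's Adam defaults, so the stated setup reduces to a few lines. Here model is a stand-in, not the paper's model, and train is the ZINC split loaded above:

```python
import torch.nn as nn
from torch.optim import Adam
from torch_geometric.loader import DataLoader

model = nn.Linear(16, 1)                                  # stand-in; any nn.Module works
loader = DataLoader(train, batch_size=32, shuffle=True)   # batch size quoted in the paper
optimizer = Adam(
    model.parameters(),
    lr=0.01,                             # learning rate quoted in the paper
    betas=(0.9, 0.999),                  # PyTorch defaults, matching the stated β1, β2
    eps=1e-8,                            # PyTorch default, matching the stated ϵ
)
```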