A Generalization of ViT/MLP-Mixer to Graphs
Authors: Xiaoxin He, Bryan Hooi, Thomas Laurent, Adam Perold, Yann LeCun, Xavier Bresson
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our architecture on 4 simulated datasets and 7 real-world benchmarks, and show highly competitive results on all of them. |
| Researcher Affiliation | Collaboration | School of Computing, National University of Singapore; Institute of Data Science, National University of Singapore; Loyola Marymount University; Element, Inc.; New York University; Meta AI. |
| Pseudocode | No | The paper describes the proposed architecture and processes in narrative form, detailing steps such as 'Raw node and edge linear embedding' and 'Graph convolutional layers with MP-GNN', but it does not include any formally structured or labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code is available for reproducibility at: https://github.com/XiaoxinHe/Graph-ViT-MLPMixer. |
| Open Datasets | Yes | We evaluate our Graph ViT/MLP-Mixer on a wide range of graph benchmarks; 1) Simulated datasets: CSL, EXP, SR25 and Tree Neighbour Match dataset, 2) Small real-world datasets: ZINC, MNIST and CIFAR10 from Benchmarking GNNs (Dwivedi et al., 2020), and MolTOX21 and MolHIV from OGB (Hu et al., 2020) and 3) Large real-world datasets: Peptides-func and Peptides-struct from LRGB (Dwivedi et al., 2022). |
| Dataset Splits | Yes | The dataset comes with a predefined 10K/1K/1K train/validation/test split. [...] for MNIST 55K/5K/10K and for CIFAR10 45K/5K/10K train/validation/test graphs. [...] for MolTOX21 6K/0.78K/0.78K and for MolHIV 32K/4K/4K train/validation/test. All real-world evaluated benchmarks define a standard train/validation/test dataset split. |
| Hardware Specification | Yes | We ran our experiments on NVIDIA RTX A5000 GPUs. |
| Software Dependencies | No | We implement our model using PyTorch (Paszke et al., 2019) and PyG (Fey & Lenssen, 2019). The paper mentions software packages like PyTorch and PyG along with their originating papers, but does not specify the exact version numbers used for the experiments. |
| Experiment Setup | Yes | For optimization, we use the Adam (Kingma & Ba, 2014) optimizer, with the default settings of β1 = 0.9, β2 = 0.999, and ϵ = 1e−8. We use the same hyperparameters with a batch size of 32 and a learning rate of 0.01 without further tuning. [...] The hidden size is set to 128 and the number of layers is set to 4 by default. |
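The reported experiment setup can be sketched as a PyTorch optimizer configuration. This is a minimal illustration of the quoted hyperparameters only (Adam with β1 = 0.9, β2 = 0.999, ϵ = 1e−8, learning rate 0.01, batch size 32, hidden size 128); the `model` below is a placeholder stand-in, not the paper's Graph ViT/MLP-Mixer architecture.

```python
import torch

# Placeholder model; hidden size 128 matches the reported default,
# but the actual architecture is defined in the authors' repository.
model = torch.nn.Linear(128, 128)

# Adam with the paper's stated settings (these are also PyTorch's defaults
# for betas and eps; only the learning rate differs from torch's default).
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=0.01,
    betas=(0.9, 0.999),
    eps=1e-8,
)

batch_size = 32  # reported batch size, used without further tuning
```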