Scale Equivariant Graph Metanetworks
Authors: Ioannis Kalogeropoulos, Giorgos Bouritsas, Yannis Panagakis
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our method advances the state-of-the-art performance for several datasets and activation functions, highlighting the power of scaling symmetries as an inductive bias for NN processing. |
| Researcher Affiliation | Academia | Ioannis Kalogeropoulos (1,2), Giorgos Bouritsas (1,2), Yannis Panagakis (1,2); (1) National and Kapodistrian University of Athens, (2) Archimedes/Athena RC, Greece |
| Pseudocode | Yes | Algorithm 1: Bias shift (elimination of symmetries due to periodicity). A hedged sketch of this kind of bias canonicalization is given after the table. |
| Open Source Code | No | The question refers to the code being made available *for the work described in this paper*. The paper states 'The source code is publicly available at https://github.com/jkalogero/scalegmn.' However, checklist question #5 states 'The code will be made publicly available at a later stage after all the required steps are taken for proper documentation to make it easy to use for the interested user', which indicates the code was not yet available at the time of submission. |
| Open Datasets | Yes | We evaluated our method on three INR datasets. Provided as open source by Navon et al. [54], the datasets MNIST and Fashion MNIST contain a single INR for each image of MNIST [38] and Fashion MNIST [76] respectively. The selection of these datasets was encouraged by the fact that they were also selected by prior works, establishing them as a useful first benchmark on INR metanetworks. As a third INR dataset, we use CIFAR-10, made publicly available by Zhou et al. [85], which contains one INR per image of CIFAR-10 [35]. |
| Dataset Splits | Yes | We select the two datasets used in [85], namely CIFAR-10-GS and SVHN-GS originally from Small CNN Zoo [74]. These contain CNNs with ReLU or tanh activations, which exhibit scale and sign symmetries respectively. [...] In the first case, we split each dataset into two subsets each containing the same activation and evaluate all baselines. As shown in Table 2, once again ScaleGMN outperforms all the baselines in all the examined datasets. |
| Hardware Specification | Yes | All the experiments were conducted on an NVIDIA GeForce RTX 4090. |
| Software Dependencies | No | No specific version numbers for software dependencies were mentioned, only programming language details. |
| Experiment Setup | Yes | We optimise the following hyperparameters: batch size in {64, 128, 256}, hidden dimension for node/edge features in {64, 128, 256}. We also search learning rates in {0.001, 0.0005, 0.0001}, weight decay in {0.01, 0.001}, dropout in {0, 0.1, 0.2} and number of GNN layers in {2, 3, 4, 5}. Moreover, we experiment with using only vertex positional encodings or also employing edge positional encodings. We apply layer normalization within each MLP and use skip connections between each GNN layer. The last two steps proved valuable in stabilising training. Finally, for each MLP within the architecture we use the SiLU activation function, one hidden layer and no activation function after the last layer. A hedged sketch of this search space and MLP design is given after the table. |
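Regarding the pseudocode row above: Algorithm 1 eliminates symmetries that arise from the periodicity of the sine activation. The snippet below is a minimal sketch of one way such a bias shift can be realised, assuming it canonicalizes each sine-layer bias into [-π, π) by reducing it modulo 2π (since sin(w·x + b + 2πk) = sin(w·x + b) for any integer k). The function name `canonicalize_sine_biases` is hypothetical; this is not the authors' implementation of Algorithm 1.

```python
import numpy as np

def canonicalize_sine_biases(biases, period=2 * np.pi):
    """Map each bias into [-period/2, period/2) by reducing it modulo the period.

    For a sine activation, sin(w @ x + b) equals sin(w @ x + b + 2*pi*k) for any
    integer k, so shifting a bias by a multiple of 2*pi leaves the network
    function unchanged. Picking the representative in [-pi, pi) removes this
    periodicity symmetry before the weights are processed further.
    """
    biases = np.asarray(biases, dtype=float)
    return (biases + period / 2) % period - period / 2

# Biases that differ only by multiples of 2*pi map to the same representative.
b = np.array([0.3, 0.3 + 2 * np.pi, 0.3 - 4 * np.pi])
print(canonicalize_sine_biases(b))  # approximately [0.3, 0.3, 0.3]
```

Under such a convention, two INRs whose biases differ only by multiples of the period would map to the same canonical representative before being handed to the metanetwork.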
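Regarding the experiment-setup row, the sketch below encodes the quoted hyperparameter grid and MLP design as a configuration. The names `SEARCH_SPACE`, `iter_configs` and `build_mlp` are hypothetical, and the exact placement of layer normalization and dropout inside the MLP is an assumption; this is not the authors' code.

```python
import itertools
import torch.nn as nn

# Hypothetical encoding of the hyperparameter grid quoted in the table.
SEARCH_SPACE = {
    "batch_size": [64, 128, 256],
    "hidden_dim": [64, 128, 256],           # node/edge feature dimension
    "learning_rate": [1e-3, 5e-4, 1e-4],
    "weight_decay": [1e-2, 1e-3],
    "dropout": [0.0, 0.1, 0.2],
    "num_gnn_layers": [2, 3, 4, 5],
    "edge_positional_encodings": [False, True],  # vertex PEs only vs. vertex + edge PEs
}

def iter_configs(space):
    """Yield every configuration in the grid as a dict."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

def build_mlp(in_dim, hidden_dim, out_dim, dropout):
    """One-hidden-layer MLP with SiLU, layer normalization and no activation
    after the last layer, mirroring the description quoted in the table."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim),
        nn.LayerNorm(hidden_dim),
        nn.SiLU(),
        nn.Dropout(dropout),
        nn.Linear(hidden_dim, out_dim),
    )
```

Iterating `iter_configs(SEARCH_SPACE)` enumerates the full grid described in the quote; skip connections between GNN layers, mentioned in the same excerpt, would live in the surrounding GNN module rather than in this MLP block.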