Dynamic Rescaling for Training GNNs
Authors: Nimrah Mustafa, Rebekka Burkholz
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We primarily study the effect of training GAT in a balanced state based on the relative gradients criterion (see Eq. (4)), by dynamic rescaling on five real-world heterophilic benchmark datasets [32]. We explore our conceptual ideas empirically and find promising directions to utilize dynamic rescaling for more practical benefits, by training in balance or controlling the order of learning among network layers. |
| Researcher Affiliation | Academia | Nimrah Mustafa, CISPA, 66123 Saarbrücken, Germany (nimrah.mustafa@cispa.de); Rebekka Burkholz, CISPA, 66123 Saarbrücken, Germany (burkholz@cispa.de) |
| Pseudocode | No | The paper describes the rebalancing procedure through equations (Eq. (6) and (7)) but does not present it as a structured pseudocode or algorithm block; a hedged illustrative sketch is given after this table. |
| Open Source Code | Yes | Our experimental code is available at https://github.com/RelationalML/Dynamic_Rescaling_GAT. |
| Open Datasets | Yes | We primarily study the effect of training GAT in a balanced state based on the relative gradients criterion (see Eq. (4)), by dynamic rescaling on five real-world heterophilic benchmark datasets [32]. |
| Dataset Splits | Yes | Given the input graph G with a .75/.25/.25 train/validation/test split, we train an L = k layer GAT network with the same architecture as M_k but initialized with a looks-linear orthogonal structure, which ensures that the network must learn the non-linear transformations of the target network. |
| Hardware Specification | Yes | Experiments were run on an NVIDIA RTX A6000 GPU with 50GB RAM. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'looks-linear orthogonal structure' but does not specify version numbers for any software dependencies or libraries used. |
| Experiment Setup | Yes | All experiments use the Adam optimizer and networks are randomly initialized with looks-linear orthogonal structure [36, 1] unless specified otherwise. [...] A maximum of 10 iterations for the rebalancing procedure outlined in Eq. (6) and (7) were used. [...] The best learning rate from {0.01, 0.001, 0.005}. (A sketch of this learning-rate selection protocol also follows the table.) |
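
Since the rebalancing procedure is specified only through Eq. (6) and (7) in the paper, the following is a minimal illustrative sketch of what such an iterative per-neuron rescaling loop can look like. It is not the paper's method: it balances the classical incoming/outgoing weight-norm condition for plain dense ReLU layers rather than the relative-gradients criterion of Eq. (4), it ignores GAT attention parameters, and the names `rebalance` and `weights` are hypothetical. The sketch relies only on the positive homogeneity of ReLU, which makes per-neuron rescaling function-preserving, and mirrors the paper's cap of 10 iterations.

```python
import torch

# Illustrative sketch only (NOT the paper's Eq. (6)/(7)): iteratively rescales
# each hidden neuron so that its incoming and outgoing squared weight norms
# match -- the classical balancedness condition for ReLU networks.
@torch.no_grad()
def rebalance(weights, max_iters=10, tol=1e-6):
    """weights: list of [out_dim, in_dim] tensors of consecutive dense layers.
    Scaling a neuron's incoming row by a and its outgoing column by 1/a leaves
    the network function unchanged (ReLU positive homogeneity)."""
    for _ in range(max_iters):
        max_shift = 0.0
        for W_in, W_out in zip(weights[:-1], weights[1:]):
            in_norm = W_in.pow(2).sum(dim=1)    # per-neuron incoming norm
            out_norm = W_out.pow(2).sum(dim=0)  # per-neuron outgoing norm
            # balance: a^2 * in_norm == out_norm / a^2  =>  a = (out/in)^(1/4)
            a = (out_norm / in_norm.clamp_min(1e-12)).pow(0.25)
            W_in.mul_(a.unsqueeze(1))           # rescale incoming rows
            W_out.div_(a.unsqueeze(0))          # inverse-rescale outgoing columns
            max_shift = max(max_shift, (a - 1.0).abs().max().item())
        if max_shift < tol:                     # factors near 1: balanced
            break
    return weights
```

Because rescaling one layer pair perturbs the balance of its neighbours, the loop sweeps all pairs repeatedly until the rescaling factors converge toward 1, which is why an iteration cap such as the paper's 10 is needed.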
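
The experiment-setup row reports Adam with the best learning rate chosen from {0.01, 0.001, 0.005}. Below is a minimal sketch of such a selection-by-validation sweep; `build_model`, `train_epoch`, and `val_accuracy` are hypothetical stand-ins, the epoch count is arbitrary, and the authors' actual training code is in the linked repository.

```python
import torch

# Hedged sketch of the reported protocol: train once per candidate learning
# rate with Adam and keep the rate that maximizes validation accuracy.
def select_learning_rate(build_model, train_epoch, val_accuracy,
                         lrs=(0.01, 0.001, 0.005), epochs=200):
    best_lr, best_acc = None, float("-inf")
    for lr in lrs:
        model = build_model()  # fresh looks-linear-initialized network per run
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            train_epoch(model, opt)
        acc = val_accuracy(model)
        if acc > best_acc:
            best_lr, best_acc = lr, acc
    return best_lr
```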