Disentangled Continual Graph Neural Architecture Search with Invariant Modular Supernet
Authors: Zeyang Zhang, Xin Wang, Yijian Qin, Hong Chen, Ziwei Zhang, Xu Chu, Wenwu Zhu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our method achieves state-of-the-art performance against baselines in continual graph neural architecture search. |
| Researcher Affiliation | Academia | 1Department of Computer Science and Technology, BNRIST, Tsinghua University, Beijing, China. |
| Pseudocode | Yes | Algorithm 1 The pipeline for GASIM Require: The number of tasks T, hyperparameter λ, K. 1: Construct the modular super-network in Sec. 4.1. 2: for l = 1, . . . , T do 3: Predict the latent factors as Eq. (8) 4: Route the task to the module as Eq. (10) 5: Calculate routing loss as Eq. (11) 6: Calculate invariance loss as Eq. (13) 7: Calculate the final loss as Eq. (14) 8: Search the architecture according to Eq. (15) 9: Fix the architecture and finetune the weights 10: end for |
| Open Source Code | No | The paper does not provide any explicit statements about releasing code or links to a code repository for the described methodology. |
| Open Datasets | Yes | Cora Full (McCallum et al., 2000), Arxiv (Hu et al., 2020), and Reddit (Hamilton et al., 2017). |
| Dataset Splits | Yes | All datasets are partitioned into a set of tasks, each focusing on the node classification problem, where each task involves nodes from two distinct classes within an incoming graph. For each task, 60% of the nodes are allocated for training, 20% for validation, and 20% for testing. |
| Hardware Specification | Yes | CPU: Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz GPU: NVIDIA GeForce RTX 4090 with 24 GB of memory |
| Software Dependencies | Yes | Software: Python 3.8.18, CUDA 12.2, PyTorch (Paszke et al., 2019) 2.1.2, PyTorch Geometric (Fey & Lenssen, 2019) 2.4.0. |
| Experiment Setup | Yes | For fair comparisons, all methods adopt the same dimensionality d of 512 and number of layers of 2. Adam optimizer (Kingma & Ba, 2014) is adopted to optimize the model weights with a learning rate 1e-3 and another SGD optimizer with a learning rate 1e-2 is adopted to optimize architecture parameters for NAS methods. For our method, we adopt K = 3 for all datasets, and the hyperparameter λ ∈ {0.01, 0.1, 1, 10, 100}. |
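
The Pseudocode row quotes the paper's Algorithm 1 in flattened form. As a reading aid, the Python skeleton below mirrors its per-task control flow only; every name (`gasim_pipeline`, `supernet.predict_latent_factors`, the λ-weighted loss combination, and so on) is a placeholder inferred from the quoted steps, since no code is released and equations (8)–(15) are not reproduced here.

```python
# Hypothetical skeleton of Algorithm 1 (GASIM pipeline) as quoted above.
# All method names and the loss combination are assumptions: they mirror
# the per-task control flow of the quoted steps, not the authors' code.

def gasim_pipeline(tasks, supernet, lam=1.0, K=3):
    """Run the continual NAS loop over the task sequence (Steps 2-10)."""
    for task in tasks:                                      # Step 2: for l = 1..T
        z = supernet.predict_latent_factors(task)           # Step 3: Eq. (8)
        module = supernet.route(z, K)                       # Step 4: Eq. (10)
        loss_route = supernet.routing_loss(z, module)       # Step 5: Eq. (11)
        loss_inv = supernet.invariance_loss(task, module)   # Step 6: Eq. (13)
        loss_task = supernet.task_loss(task, module)        # assumed task-level loss term
        loss = loss_task + lam * (loss_route + loss_inv)    # Step 7: Eq. (14), placeholder form
        arch = supernet.search_architecture(loss)           # Step 8: Eq. (15)
        supernet.finetune(arch, task)                       # Step 9: fix arch, tune weights
```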
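The Dataset Splits row reports a 60/20/20 per-task node split. The minimal PyTorch sketch below shows one way to build such boolean masks; the function name and the fixed seed are illustrative assumptions, as the exact splitting code is not provided.

```python
import torch

def split_task_nodes(num_nodes: int, seed: int = 0):
    """Randomly split a task's nodes 60/20/20 into train/val/test masks.
    Illustrative only; the paper's exact splitting procedure is not released."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_nodes, generator=g)
    n_train = int(0.6 * num_nodes)
    n_val = int(0.2 * num_nodes)

    train_mask = torch.zeros(num_nodes, dtype=torch.bool)
    val_mask = torch.zeros(num_nodes, dtype=torch.bool)
    test_mask = torch.zeros(num_nodes, dtype=torch.bool)
    train_mask[perm[:n_train]] = True
    val_mask[perm[n_train:n_train + n_val]] = True
    test_mask[perm[n_train + n_val:]] = True
    return train_mask, val_mask, test_mask
```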
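The Experiment Setup row describes two optimizers: Adam with learning rate 1e-3 for model weights and SGD with learning rate 1e-2 for architecture parameters, a common differentiable-NAS arrangement. The sketch below wires up that configuration with standard `torch.optim` calls; `weight_parameters()` and `arch_parameters()` are assumed accessors on a supernet-style model, not an API documented in the paper.

```python
import torch

def build_optimizers(model):
    """Set up the two optimizers reported in the experiment setup.
    weight_parameters() / arch_parameters() are assumed accessors on a
    supernet-style model that separates weights from architecture params."""
    weight_opt = torch.optim.Adam(model.weight_parameters(), lr=1e-3)
    arch_opt = torch.optim.SGD(model.arch_parameters(), lr=1e-2)
    return weight_opt, arch_opt
```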