Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Matcha: Mitigating Graph Structure Shifts with Test-Time Adaptation
Authors: Wenxuan Bao, Zhichen Zeng, Zhining Liu, Hanghang Tong, Jingrui He
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on synthetic and real-world datasets to evaluate our proposed Matcha from the following aspects: RQ1: How can Matcha empower TTA algorithms and handle various structure shifts on graphs? RQ2: To what extent can Matcha restore the representation quality better than other methods? |
| Researcher Affiliation | Academia | University of Illinois Urbana-Champaign |
| Pseudocode | Yes | Algorithm 1 Matcha |
| Open Source Code | Yes | Our code is available at https://github.com/baowenxuan/Matcha. |
| Open Datasets | Yes | We first adopt CSBM (Deshpande et al., 2018) to generate synthetic graphs with controlled structure and attribute shifts. ... For real-world datasets, we adopt Syn-Cora (Zhu et al., 2020), Syn-Products (Zhu et al., 2020), Twitch-E (Rozemberczki et al., 2021), and OGB-Arxiv (Hu et al., 2020). |
| Dataset Splits | Yes | We use non-overlapping train-test split over nodes on Syn-Cora to avoid label leakage. ... For OGB-Arxiv, we use a subgraph consisting of papers from 1950 to 2011 as the source graph, 2011 to 2014 as the validation graph, and 2014 to 2020 as the target graph. |
| Hardware Specification | Yes | We use single Nvidia Tesla V100 with 32GB memory. However, for the majority of our experiments, the memory usage should not exceed 8GB. We switch to Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz when recording the computation time. |
| Software Dependencies | No | The paper does not explicitly mention specific software dependencies with version numbers. |
| Experiment Setup | Yes | For CSBM, Syn-Cora, Syn-Products, we use GPRGNN with K = 9. The featurizer is a linear layer, followed by a batchnorm layer, and then the GPR module. The classifier is a linear layer. The dimension for representation is 32. For Twitch-E and OGB-Arxiv, we use GPRGNN with K = 5. The dimension for representation is 8 and 128, respectively. |
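
The Experiment Setup row describes a featurizer that ends in a GPR module with K = 9 (or K = 5) propagation steps. As a rough illustration of what that module computes, the sketch below implements the generalized PageRank sum, `sum_k gammas[k] * A_norm^k @ H`, in plain Python. It is not the authors' implementation (see the repository linked in the table for that). The function names, the dense symmetric normalization with self-loops, and the fixed PageRank-style weights are all assumptions made for illustration; in GPRGNN the weights are learned, and the linear, batchnorm, and classifier layers are omitted here.

```python
import math


def normalized_adjacency(num_nodes, edges):
    """Dense D^{-1/2} (A + I) D^{-1/2} for an undirected graph (assumed normalization)."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    deg = [sum(row) for row in adj]
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gpr_propagate(a_norm, hidden, gammas):
    """Return sum over k = 0..K of gammas[k] * a_norm^k @ hidden, with K = len(gammas) - 1."""
    out = [[gammas[0] * h for h in row] for row in hidden]
    current = hidden
    for gamma in gammas[1:]:
        current = matmul(a_norm, current)  # one more hop of propagation
        out = [
            [o + gamma * c for o, c in zip(out_row, cur_row)]
            for out_row, cur_row in zip(out, current)
        ]
    return out


# Hypothetical usage: a 4-node path graph with 2-dimensional hidden features.
K = 9  # number of propagation steps, as reported for CSBM, Syn-Cora, Syn-Products
alpha = 0.1  # illustrative value only, not taken from the paper
gammas = [alpha * (1 - alpha) ** k for k in range(K)] + [(1 - alpha) ** K]
a_norm = normalized_adjacency(4, [(0, 1), (1, 2), (2, 3)])
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
representation = gpr_propagate(a_norm, hidden, gammas)
```

Because the weights `gammas` decide how much each hop count contributes, they control how strongly the representation depends on graph structure, which is the quantity a structure shift would perturb.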