Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Unsupervised Neighborhood Propagation Kernel Layers for Semi-supervised Node Classification
Authors: Sonny Achten, Francesco Tonin, Panagiotis Patrinos, Johan A.K. Suykens
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the effectiveness of the proposed framework. Experiments: Datasets and Main Setting. As datasets, we use four homophilious graphs: Cora, CiteSeer, PubMed (Sen et al. 2008; Yang, Cohen, and Salakhutdinov 2016), and OGB-Arxiv (Hu et al. 2020), which are citation graphs, as well as two heterophilious graphs: Chameleon and Squirrel (Wikipedia graphs (Rozemberczki, Allen, and Sarkar 2021)). |
| Researcher Affiliation | Academia | Sonny Achten, Francesco Tonin, Panagiotis Patrinos, Johan A. K. Suykens. KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics. |
| Pseudocode | Yes | Algorithm 1: Optimization algorithm of GCKM. 1: Initialize $\{H^{(1)}_0, H^{(2)}_0, H^{(3)}_0\}$. 2: for $k = 0, 1, \ldots, T$ do. 3: Compute $K^{(1)}_c$ from aggregated $X$. 4: Compute $K^{(2)}_c$ from aggregated $H^{(1)}_k$. 5: Update $\{H^{(1)}_{k+1}, H^{(2)}_{k+1}\} \leftarrow \text{CayleyAdam}(J_{\mathrm{GCKM}})$. 6: Compute $K^{(3)}$ from $H^{(2)}_{k+1}$. 7: Update $\{H^{(3)}_{k+1}\} \leftarrow$ Solve (11) with $K^{(3)}$. |
| Open Source Code | Yes | The reported results can be reproduced using our code on GitHub¹ and the Appendix is available at arXiv². ¹ https://github.com/sonnyachten/GCKM |
| Open Datasets | Yes | As datasets, we use four homophilious graphs: Cora, CiteSeer, PubMed (Sen et al. 2008; Yang, Cohen, and Salakhutdinov 2016), and OGB-Arxiv (Hu et al. 2020), which are citation graphs, as well as two heterophilious graphs: Chameleon and Squirrel (Wikipedia graphs (Rozemberczki, Allen, and Sarkar 2021)). |
| Dataset Splits | Yes | For Cora, CiteSeer, and PubMed, there are 4 training labels per class, 100 labels for validation, and 1000 labels for testing. For Chameleon and Squirrel, a 0.5%/0.5%/99% train/validation/test split is used. For OGB-Arxiv, we use a 2.5%/2.5%/95% random split |
| Hardware Specification | No | We also thank the Flemish Supercomputer (VSC). |
| Software Dependencies | No | We therefore employ the CayleyAdam optimizer (Li, Li, and Todorovic 2019) to update H(1), H(2) with H(3) fixed. |
| Experiment Setup | Yes | We used GCN aggregation for the citation networks and sum aggregation for the heterophilious graphs. We used a deep GCKM with only two unsupervised layers, with RBF bandwidth $\sigma^2_{\mathrm{RBF}} = m\sigma^2$ with $m$ the input dimension and $\sigma^2$ the variance of the inputs, and clustering obtained by k-means on H(2). |
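
The bandwidth rule quoted in the Experiment Setup row (RBF bandwidth equal to the input dimension m times the variance of the inputs) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `rbf_bandwidth_sq` and `rbf_kernel` are hypothetical, and two details are assumptions that the quoted text does not specify: the variance is taken over all entries of the input matrix, and the kernel has the common form exp(-||x - y||² / (2σ²)). Kernel centering and neighborhood aggregation, which Algorithm 1 refers to, are left out.

```python
import numpy as np


def rbf_bandwidth_sq(X):
    """Squared RBF bandwidth: m * sigma^2 (hypothetical helper).

    m is the input dimension (number of columns of X).
    Assumption: sigma^2 is the variance over all entries of X.
    """
    m = X.shape[1]
    return m * X.var()


def rbf_kernel(X, bandwidth_sq):
    """Dense RBF kernel matrix for the rows of X (hypothetical helper).

    Assumption: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth_sq)).
    """
    sq_norms = (X ** 2).sum(axis=1)
    # Pairwise squared Euclidean distances via the expansion of ||x - y||^2.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Clamp small negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth_sq))


# Usage on random stand-in features (not one of the datasets in the table).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 4))
K_demo = rbf_kernel(X_demo, rbf_bandwidth_sq(X_demo))
```

Scaling the bandwidth with m keeps typical squared distances, which grow roughly linearly with the dimension, on the same order as the bandwidth, so kernel values do not collapse toward zero for high-dimensional inputs.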