Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

PROXI: Challenging the GNNs for Link Prediction

Authors: Astrit Tola, Jack Myrick, Baris Coskunuzer

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Motivated by these observations, we conduct empirical tests to compare the performance of current GNN models with more conventional and direct methods in link prediction tasks. Introducing our model, PROXI, which leverages proximity information of node pairs in both graph and attribute spaces, we find that standard machine learning (ML) models perform competitively, even outperforming cutting-edge GNN models when applied to these proximity metrics derived from node neighborhoods and attributes. This holds true across both homophilic and heterophilic networks, as well as small and large benchmark datasets, including those from the Open Graph Benchmark (OGB). Moreover, we show that augmenting traditional GNNs with PROXI significantly boosts their link prediction performance. Our empirical findings corroborate the previously mentioned theoretical observations and imply that there exists ample room for enhancement in current GNN models to reach their potential. Our code is available at https://github.com/workrep20232/PROXI
Researcher Affiliation Academia Astrit Tola (EMAIL), Department of Mathematical Sciences, University of Texas at Dallas; Jack Alec Myrick (EMAIL), Department of Computer Science, University of Texas at Dallas; Baris Coskunuzer (EMAIL), Department of Mathematical Sciences, University of Texas at Dallas
Pseudocode No The paper describes its methodology in Section 3 and provides detailed descriptions of structural and domain proximity indices. However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured, code-like steps for its methods in such a format.
Open Source Code Yes Our code is available at https://github.com/workrep20232/PROXI
Open Datasets Yes Datasets. In our experiments, we used twelve benchmark datasets for link prediction tasks. All the datasets are used in the transductive setting, like most other baselines in the domain. The dataset statistics are given in Table 2. The details of the datasets are given in Appendix A. Appendix A. The citation network datasets, namely CORA, CITESEER, and PUBMED, are introduced in (Yang et al., 2016). In the context of co-purchasing networks, the benchmark datasets PHOTO and COMPUTERS are introduced in (Shchur et al., 2018). Next, the OGBL-COLLAB dataset is part of the Open Graph Benchmark (OGB) collection of large benchmark datasets (Hu et al., 2020; 2021a). Another OGB dataset is OGBL-PPA. The datasets WISCONSIN and TEXAS are introduced in (Pei et al., 2020). Finally, the heterophilic Wikipedia webpage structures are represented by the networks CROCODILE, CHAMELEON, and SQUIRREL (Pei et al., 2020).
Dataset Splits Yes Experiment Settings. To compare our model's performance, we adopted the common method proposed in Lichtenwalter et al. (2010) with an 85/5/10 split for all datasets except the OGB datasets, which come with their own predefined training and test sets. To expand the comparison baselines, we also report the performance of our model with a different split (70/10/20) in Table 9 in the Appendix.
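The 85/5/10 ratio described above can be sketched as a simple random edge partition. This is a minimal illustration only: the function name is hypothetical, and the full protocol of Lichtenwalter et al. (2010) also involves negative sampling, which is omitted here.

```python
import random

def split_edges(edges, train=0.85, valid=0.05, seed=0):
    """Randomly partition an edge list into train/valid/test sets
    (85/5/10 by default, matching the split described above)."""
    edges = list(edges)
    random.Random(seed).shuffle(edges)
    n = len(edges)
    n_train = int(train * n)
    n_valid = int(valid * n)
    return (edges[:n_train],
            edges[n_train:n_train + n_valid],
            edges[n_train + n_valid:])

edges = [(i, i + 1) for i in range(100)]
tr, va, te = split_edges(edges)
print(len(tr), len(va), len(te))  # 85 5 10
```

With a fixed seed the split is reproducible across runs, which matters when comparing baselines on the same folds.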
Hardware Specification Yes Implementation and Runtime. We ran experiments on a single machine with a 12th-generation Intel Core i7-1270P vPro processor (E-cores up to 3.50 GHz, P-cores up to 4.80 GHz) and 32 GB of RAM (LPDDR5, 6400 MHz).
Software Dependencies No The paper mentions 'XGBoost' as the primary machine learning tool but does not provide a specific version number for it or any other software dependency.
Experiment Setup Yes Hyperparameter Settings. In our study, XGBoost acts as the primary machine learning tool. The optimization objective is defined as rank:pairwise, with logloss as the evaluation metric. When assessing outcomes through the AUC metric, we configure a maximum tree depth of 5, a learning rate varying in [0.01, 0.05], the number of estimators at 1000, and the regularization parameter lambda set to 10.0. For the more demanding metric, Hits@20, modifications are implemented: the maximum tree depth is set to 5, the colsample_bytree ratio is adjusted to 1, the learning rate is set to 0.1, and lambda is set to 1.0, while other parameters remain consistent with those used for the AUC metric. Similarly, for the Hits@50 metric, the maximum tree depth is reset to 11, the learning rate is increased to 0.5, and lambda is set to 1.0, with all other hyperparameters aligned with those used for the AUC metric. Lastly, in the context of the Hits@100 metric, adaptations are made within the AUC hyperparameter setting: the maximum tree depth is changed to 5, the learning rate to 0.3, the subsample ratio to 0.5, and the colsample_bytree ratio to 1.0, while lambda is maintained at 1.0.
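The per-metric settings above can be collected into parameter dictionaries keyed by evaluation metric. This is a hedged sketch: the key names follow the standard XGBoost scikit-learn API (max_depth, n_estimators, reg_lambda, etc.), the helper function is hypothetical, and the values are transcribed from the paper's description; the AUC learning rate is a tuning range, not a single value.

```python
# Base configuration used for the AUC metric, as described in the paper.
BASE_AUC_PARAMS = {
    "objective": "rank:pairwise",
    "eval_metric": "logloss",
    "max_depth": 5,
    "learning_rate": (0.01, 0.05),  # tuned within this range for AUC
    "n_estimators": 1000,
    "reg_lambda": 10.0,
}

def params_for(metric):
    """Return the per-metric overrides on top of the AUC baseline."""
    p = dict(BASE_AUC_PARAMS)
    if metric == "Hits@20":
        p.update({"max_depth": 5, "colsample_bytree": 1.0,
                  "learning_rate": 0.1, "reg_lambda": 1.0})
    elif metric == "Hits@50":
        p.update({"max_depth": 11, "learning_rate": 0.5,
                  "reg_lambda": 1.0})
    elif metric == "Hits@100":
        p.update({"max_depth": 5, "learning_rate": 0.3,
                  "subsample": 0.5, "colsample_bytree": 1.0,
                  "reg_lambda": 1.0})
    return p

print(params_for("Hits@50")["max_depth"])  # 11
```

A dictionary like this can be passed directly as keyword arguments to an XGBoost ranker once a concrete learning rate is chosen for the AUC case.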