Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

TopER: Topological Embeddings in Graph Representation Learning

Authors: Astrit Tola, Funmilola Mary Taiwo, Cuneyt Akcora, Baris Coskunuzer

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate the performance of TopER in classification, clustering and visualization. Our Python implementation is available at https://github.com/AstritTola/TopER. [...] We conduct experiments on nine benchmark datasets for graph classification. [...] Table 2 shows the accuracy results for the given models. [...] We have conducted three ablation studies.
Researcher Affiliation	Academia	Astrit Tola Department of Mathematics Florida State University Tallahassee, FL 32306 EMAIL; Funmilola Mary Taiwo Department of Statistics University of Manitoba Winnipeg, Manitoba, Canada EMAIL; Cuneyt Gurcan Akcora AI Institute University of Central Florida Orlando, FL, 32816 EMAIL; Baris Coskunuzer Department of Mathematical Sciences University of Texas at Dallas Richardson, TX 75080 EMAIL
Pseudocode	Yes	The full TopER method is outlined in Algorithm 1 in the Appendix.
Open Source Code	Yes	Our Python implementation is available at https://github.com/AstritTola/TopER.
Open Datasets	Yes	Datasets. We conduct experiments on nine benchmark datasets for graph classification. These are (i) the molecule graphs of BZR, and COX2 [MV09]; (ii) the biological graphs of MUTAG and PROTEINS [KM12]; and (iii) the social graphs of IMDB-Binary (IMDB-B), IMDB-Multi (IMDB-M), REDDIT-Binary (REDDIT-B), and REDDIT-Multi-5K (REDDIT-5K) [YV15]. Finally, the OGBG-MOLHIV is a large molecular property prediction dataset, part of the open graph benchmark (OGB) datasets [HFZ+20].
Dataset Splits	Yes	We employ a 90/10 train-test split, adopt the StratifiedKFold strategy, and present the average accuracy from ten-fold cross-validation across all our models.
Hardware Specification	Yes	Hardware. We ran experiments on a single machine with 12th Generation Intel Core i7-1270P vPro Processor (E-cores up to 3.50 GHz, P-cores up to 4.80 GHz), and 32GB of RAM (LPDDR5 6400MHz).
Software Dependencies	No	The paper mentions 'Our Python implementation is available at https://github.com/AstritTola/TopER.' and '3 pip install toper'. While Python and pip are mentioned, specific version numbers for Python or any critical libraries used (e.g., PyTorch, NumPy, scikit-learn, XGBoost) are not provided in the paper text.
Experiment Setup	Yes	Our proposed MLP algorithm is constructed with a single hidden layer. The output layer's activation function is set to log softmax, and the loss function we used is Negative Log Likelihood Loss. The learning rate is chosen between 0.01 and 0.001. Subsequently, we investigate the impact of the number of neurons in the hidden layer, considering values from the set {16, 64, 128}. The optimizer is set to be Adam, and the number of epochs is 500. To prevent large weights and overfitting, we apply L2 regularization coefficients of 1e-3, 1e-4. The activation function for the hidden layer varies between ReLU, GeLU, and ELU. Lastly, we consider the cases of adding or not a batch normalization layer to the output of the hidden layer and setting dropout values to be 0.0 or 0.5. In Table 14, we provide the details for each dataset.
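
The Dataset Splits and Experiment Setup rows above describe a ten-fold stratified evaluation and a small hyperparameter search. The sketch below is an illustration of those two ideas only and is not the authors' code. It makes several assumptions: the learning rate is read as two discrete candidates (0.01 and 0.001) rather than a continuous range, the search is treated as a full grid, and the helper names `stratified_folds`, `train_test_splits`, `iter_configs`, `SEARCH_SPACE`, and `FIXED` are hypothetical. The fold assignment is a simplified round-robin stand-in for scikit-learn's StratifiedKFold, written with the standard library so it runs without dependencies.

```python
from collections import defaultdict
from itertools import product

# Candidate values taken from the Experiment Setup row.
# Assumption: the learning rate is one of the two quoted endpoints.
SEARCH_SPACE = {
    "learning_rate": [0.01, 0.001],
    "hidden_units": [16, 64, 128],
    "l2_coefficient": [1e-3, 1e-4],
    "activation": ["ReLU", "GeLU", "ELU"],
    "batch_norm": [False, True],
    "dropout": [0.0, 0.5],
}

# Settings the row describes as fixed across all runs.
FIXED = {
    "hidden_layers": 1,
    "output_activation": "log_softmax",
    "loss": "negative_log_likelihood",
    "optimizer": "Adam",
    "epochs": 500,
}


def iter_configs(space=SEARCH_SPACE):
    """Yield every combination of the search space, merged with FIXED."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield {**FIXED, **dict(zip(keys, values))}


def stratified_folds(labels, n_folds=10):
    """Assign sample indices to folds round-robin within each class.

    Simplified stand-in for a stratified k-fold splitter: each fold
    receives roughly the same share of every class.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    offset = 0  # carry the position across classes to keep fold sizes even
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[(offset + position) % n_folds].append(index)
        offset += len(indices)
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists, holding out one fold at a time."""
    for held_out, test in enumerate(folds):
        train = [
            index
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for index in fold
        ]
        yield train, test


if __name__ == "__main__":
    toy_labels = [0] * 60 + [1] * 40  # made-up labels for illustration
    folds = stratified_folds(toy_labels, n_folds=10)
    for train, test in train_test_splits(folds):
        print(len(train), len(test))
    print(sum(1 for _ in iter_configs()), "candidate configurations")
```

With ten folds, each held-out fold contains one tenth of the samples, which matches the 90/10 train-test split quoted in the Dataset Splits row. Under the full-grid assumption the search space has 2 x 3 x 2 x 3 x 2 x 2 = 144 combinations; the per-dataset settings actually used are those the row attributes to Table 14 of the paper.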