Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings

Authors: Christopher Morris, Gaurav Rattan, Petra Mutzel

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our intention here is to investigate the benefits of the local, sparse algorithms, both kernel and neural architectures, compared to the global, dense algorithms, and standard kernel and GNN baselines. More precisely, we address the following questions:
Q1 Do the local algorithms, both kernel and neural architectures, lead to improved classification and regression scores on real-world benchmark datasets compared to global, dense algorithms and standard baselines?
Q2 Does the δ-k-LWL+ lead to improved classification accuracies compared to the δ-k-LWL? Does it lead to higher computation times?
Q3 Do the local algorithms prevent overfitting to the training set?
Q4 How much do the local algorithms speed up the computation time compared to the non-local algorithms or dense neural architectures?
The source code of all methods and evaluation procedures is available at https://www.github.com/chrsmrrs/sparsewl. Datasets: To evaluate kernels, we use the following, well-known, small-scale datasets: ENZYMES [98, 13], IMDB-BINARY, IMDB-MULTI [119], NCI1, NCI109 [109], PTC_FM [53], PROTEINS [31, 13], and REDDIT-BINARY [119]. ... Results and discussion: In the following we answer questions Q1 to Q4.
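For context on the method being evaluated: the paper's δ-k-LWL algorithms are local, sparse variants of the k-dimensional Weisfeiler-Leman algorithm. Below is a minimal sketch of 1-dimensional WL color refinement, the base case that the k-WL hierarchy generalizes. This is an illustration under our own assumptions, not the authors' implementation (that lives in the repository linked above).

```python
# Minimal sketch of 1-dimensional Weisfeiler-Leman color refinement, the
# base case generalized by the k-WL hierarchy and the local delta-k-LWL
# variants studied in the paper. Illustrative only.

def wl_refine(adj, labels, iterations=3):
    """adj: dict node -> list of neighbors; labels: dict node -> initial color."""
    colors = dict(labels)
    for _ in range(iterations):
        # A node's new color hashes its old color together with the
        # sorted multiset of its neighbors' colors.
        new_colors = {
            v: hash((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
            for v in adj
        }
        # Refinement only splits color classes, so the partition is
        # stable once the number of distinct colors stops growing.
        if len(set(new_colors.values())) == len(set(colors.values())):
            break
        colors = new_colors
    return colors

# Example: a 4-cycle with uniform initial colors stays monochromatic.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(wl_refine(cycle, {v: 0 for v in cycle}))
```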
Researcher Affiliation | Academia | CERC in Data Science for Real-Time Decision-Making, Polytechnique Montréal; Department of Computer Science, RWTH Aachen University; Department of Computer Science, University of Bonn
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code of all methods and evaluation procedures is available at https://www.github.com/chrsmrrs/sparsewl.
Open Datasets | Yes | Datasets: To evaluate kernels, we use the following, well-known, small-scale datasets: ENZYMES [98, 13], IMDB-BINARY, IMDB-MULTI [119], NCI1, NCI109 [109], PTC_FM [53], PROTEINS [31, 13], and REDDIT-BINARY [119]. ... For the neural architectures, we used the large-scale molecular regression datasets ZINC [34, 57] and ALCHEMY [21]. ... QM9 [91, 112] regression dataset. All datasets can be obtained from http://www.graphlearning.io [84].
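The small-scale datasets listed above are TU benchmarks, so one way to fetch them programmatically is PyTorch Geometric's TUDataset loader. The loader choice is our assumption; the paper itself only points to the graphlearning.io URL.

```python
# Fetch one of the TU benchmarks named above via PyTorch Geometric.
from torch_geometric.datasets import TUDataset

dataset = TUDataset(root="data/ENZYMES", name="ENZYMES")
print(len(dataset), "graphs,", dataset.num_classes, "classes")
```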
Dataset Splits | No | The paper mentions 'training versus test accuracy', refers to the 'evaluation guidelines outlined in [84]', and points to hyperparameter selection routines in Appendix E.2, but it does not give explicit training/validation/test splits (e.g., percentages, sample counts, or named standard splits) in the main text.
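For readers who want to approximate the missing split protocol: a common convention for these TU benchmarks is stratified 10-fold cross-validation. The sketch below uses placeholder labels and our own seed, and makes no claim about the paper's actual folds.

```python
# Sketch of a stratified 10-fold split; labels and seed are placeholders.
import numpy as np
from sklearn.model_selection import StratifiedKFold

y = np.array([0, 1] * 50)  # placeholder: one class label per graph
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(np.zeros(len(y)), y)):
    # train_idx could be split further to hold out a validation set
    # for hyperparameter selection.
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
```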
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using 'PYTORCH GEOMETRIC [36]' and a 'Python-wrapped C++11 preprocessing routine' but does not specify version numbers for these software components.
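Because no versions are pinned, anyone reproducing the results would need to record their own environment, for example:

```python
# Record the installed versions of the dependencies the paper names.
import torch
import torch_geometric

print("torch:", torch.__version__)
print("torch_geometric:", torch_geometric.__version__)
```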
Experiment Setup | No | The paper mentions 'hyperparameter selection routines' in Appendix E.2 but does not provide specific experimental setup details, such as hyperparameter values, training configurations, or system-level settings, within the main text.
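For orientation, graph-kernel experiments of this kind are typically scored with a C-SVM on a precomputed Gram matrix, with C selected by inner cross-validation. The grid, kernel matrix, and labels below are illustrative assumptions, not the settings from Appendix E.2.

```python
# Illustrative hyperparameter selection for a graph-kernel experiment:
# a C-SVM on a precomputed Gram matrix, with C chosen by cross-validation.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

K = np.eye(20)             # placeholder n x n kernel (Gram) matrix
y = np.array([0, 1] * 10)  # placeholder graph labels
grid = GridSearchCV(
    SVC(kernel="precomputed"),
    param_grid={"C": [10.0 ** i for i in range(-3, 4)]},
    cv=5,
)
grid.fit(K, y)
print("selected C:", grid.best_params_["C"])
```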