Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings
Authors: Christopher Morris, Gaurav Rattan, Petra Mutzel
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our intention here is to investigate the benefits of the local, sparse algorithms, both kernel and neural architectures, compared to the global, dense algorithms, and standard kernel and GNN baselines. More precisely, we address the following questions: (Q1) Do the local algorithms, both kernel and neural architectures, lead to improved classification and regression scores on real-world benchmark datasets compared to global, dense algorithms and standard baselines? (Q2) Does the δ-k-LWL+ lead to improved classification accuracies compared to the δ-k-LWL? Does it lead to higher computation times? (Q3) Do the local algorithms prevent overfitting to the training set? (Q4) How much do the local algorithms speed up the computation time compared to the non-local algorithms or dense neural architectures? The source code of all methods and evaluation procedures is available at https://www.github.com/chrsmrrs/sparsewl. Datasets: To evaluate kernels, we use the following, well-known, small-scale datasets: ENZYMES [98, 13], IMDB-BINARY, IMDB-MULTI [119], NCI1, NCI109 [109], PTC_FM [53], PROTEINS [31, 13], and REDDIT-BINARY [119]. ... Results and discussion: In the following, we answer questions Q1 to Q4. (For background on the Weisfeiler-Leman refinement these algorithms build on, see the 1-WL sketch after the table.) |
| Researcher Affiliation | Academia | CERC in Data Science for Real-Time Decision-Making, Polytechnique Montréal; Department of Computer Science, RWTH Aachen University; Department of Computer Science, University of Bonn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of all methods and evaluation procedures is available at https://www.github.com/chrsmrrs/sparsewl. |
| Open Datasets | Yes | Datasets: To evaluate kernels, we use the following, well-known, small-scale datasets: ENZYMES [98, 13], IMDB-BINARY, IMDB-MULTI [119], NCI1, NCI109 [109], PTC_FM [53], PROTEINS [31, 13], and REDDIT-BINARY [119]. ... For the neural architectures, we used the large-scale molecular regression datasets ZINC [34, 57] and ALCHEMY [21]. ... QM9 [91, 112] regression dataset. All datasets can be obtained from http://www.graphlearning.io [84]. (A hedged loading sketch using PyTorch Geometric's TUDataset follows the table.) |
| Dataset Splits | No | The paper mentions 'training versus test accuracy' and refers to 'evaluation guidelines outlined in [84]' and hyperparameter selection routines in Appendix E.2, but it does not provide explicit details about the specific training/validation/test dataset splits (e.g., percentages, sample counts, or explicit standard splits) in the main text. (An illustrative fold-split sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'PYTORCH GEOMETRIC [36]' and a 'Python-wrapped C++11 preprocessing routine' but does not specify version numbers for these software components. |
| Experiment Setup | No | The paper mentions 'hyperparameter selection routines' in Appendix E.2 but does not provide specific experimental setup details such as hyperparameter values, training configurations, or system-level settings within the main text. |
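
The δ-k-LWL algorithms quoted in the Research Type row are local, sparse refinements of the Weisfeiler-Leman hierarchy. For background, here is a minimal sketch of classic 1-WL color refinement on an adjacency-list graph; it is not the paper's δ-k-LWL implementation (that lives in the sparsewl repository), and the graph encoding and function name are illustrative.

```python
# Minimal 1-WL (color refinement) sketch -- background for the delta-k-LWL
# variants the paper studies, NOT the authors' implementation.
# The graph is a plain adjacency list: {node: [neighbors]}.

def wl_refine(adj, num_iters=3):
    """Refine node colors by combining each node's color with the
    multiset of its neighbors' colors, then compressing."""
    colors = {v: 0 for v in adj}  # uniform initial coloring (no node labels)
    for _ in range(num_iters):
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in nbrs)))
            for v, nbrs in adj.items()
        }
        # Map distinct signatures back to small integer color ids.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[sig] for v, sig in signatures.items()}
    return colors

# Toy usage: a path on 4 nodes; endpoints and inner nodes get distinct colors.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(wl_refine(path))  # {0: 0, 1: 1, 2: 1, 3: 0}
```

The final color histogram is what a WL kernel compares across graphs; roughly, the paper's local k-dimensional variants run an analogous refine-and-compress loop over k-tuples, restricted to local (sparse) neighborhoods.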
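The Open Datasets row points to http://www.graphlearning.io, which hosts the TU benchmark collection, and the Software Dependencies row mentions PyTorch Geometric, which can fetch these datasets directly. A minimal sketch, assuming a recent torch_geometric install (the paper pins no versions); the cache directory is arbitrary.

```python
# Sketch: fetching one of the paper's kernel benchmarks (ENZYMES) from the
# TU collection at graphlearning.io via PyTorch Geometric's TUDataset.
# Assumes torch and torch_geometric are installed; 'data/TU' is an arbitrary path.
from torch_geometric.datasets import TUDataset

dataset = TUDataset(root='data/TU', name='ENZYMES')  # downloads on first use
print(len(dataset), dataset.num_classes, dataset.num_node_features)

graph = dataset[0]  # a torch_geometric.data.Data object
print(graph.edge_index.shape, graph.x.shape, graph.y)
```

Swapping `name=` for IMDB-BINARY, NCI1, PROTEINS, etc. fetches the other kernel benchmarks quoted above (note that some, such as the IMDB and REDDIT datasets, ship without node features).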
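Because the Dataset Splits row notes that the exact splits are deferred to [84] and Appendix E.2, the following is orientation only: a hedged sketch of the stratified 10-fold cross-validation commonly used on these TU benchmarks, not a restatement of the paper's setup. It reuses the `dataset` object from the previous sketch; the fold count and seed are illustrative.

```python
# Illustrative stratified 10-fold protocol for TU-style graph classification.
# NOT the paper's Appendix E.2 configuration -- fold count and seed are arbitrary.
import numpy as np
from sklearn.model_selection import StratifiedKFold

y = np.array([g.y.item() for g in dataset])  # one class label per graph
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

for fold, (train_idx, test_idx) in enumerate(skf.split(np.zeros(len(y)), y)):
    # train and evaluate a kernel SVM or GNN on this fold's split here
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
```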