WL meet VC
Authors: Christopher Morris, Floris Geerts, Jan Tönshoff, Martin Grohe
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical study confirms the validity of our theoretical findings. ... 5. Experimental evaluation: In the following, we investigate how well the VC dimension bounds from the previous section hold in practice. Specifically, we answer the following questions. Q1 How does the number of parameters influence GNNs' generalization performance? Q2 How does the number of 1-WL-distinguishable graphs influence GNNs' generalization performance? Q3 How does the bitlength influence a GNN's ability to fit random data? |
| Researcher Affiliation | Academia | 1Department of Computer Science, RWTH Aachen University, Aachen, Germany 2Department of Computer Science, University of Antwerp, Antwerp, Belgium. Correspondence to: Christopher Morris <morris@cs.rwth-aachen.de>. |
| Pseudocode | No | The paper describes algorithms and models using mathematical equations and textual descriptions but does not include explicit pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | The source code of all methods and evaluation procedures is available at https://www.github.com/chrsmrrs/wl_vs_vc. |
| Open Datasets | Yes | To investigate questions Q1 and Q2, we used the datasets ENZYMES (Borgwardt et al., 2005; Schomburg et al., 2004), MCF-7 (Yan et al., 2008), MCF-7H (Yan et al., 2008), MUTAGENICITY (Kazius et al., 2005; Riesen & Bunke, 2008), and NCI1 and NCI109 (Wale et al., 2008; Shervashidze et al., 2011) provided by Morris et al. (2020a). ... For the experiments regarding Q1 and Q2, we uniformly and at random choose 90% of a dataset for training and the remaining 10% for testing. |
| Dataset Splits | No | For the experiments regarding Q1 and Q2, we uniformly and at random choose 90% of a dataset for training and the remaining 10% for testing. The paper specifies a train/test split but does not explicitly mention a separate validation split or its proportion (see the loading-and-split sketch below the table). |
| Hardware Specification | Yes | All architectures were implemented using PYTORCH GEOMETRIC (Fey & Lenssen, 2019) and executed on a workstation with 128GB RAM and an NVIDIA Tesla V100 with 32GB memory. |
| Software Dependencies | No | All architectures were implemented using PYTORCH GEOMETRIC (Fey & Lenssen, 2019). The paper names the software package and cites it, but it does not provide a specific version number for PyTorch Geometric or any other software dependency. |
| Experiment Setup | Yes | For the experiments regarding Q1 and Q2, we fixed the number of layers to five and chose the feature dimension d in {4, 16, 256, 1,024}. To answer Q2, we set the feature dimension d to 64 and choose the number of layers from {0, ..., 6}. We used sum pooling and a two-layer MLP for all experiments for the final classification. ... We optimized the standard cross entropy loss for 500 epochs using the ADAM optimizer (Kingma & Ba, 2015). Moreover, we used a learning rate of 0.001 across all experiments and no learning rate decay or dropout. For Q3, we set the learning rate to 10⁻⁴ and the number of epochs to 100,000, and repeated each experiment 50 times. (See the training-loop sketch below the table.) |
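
For reference, the 90%/10% uniform random train/test split described above can be reproduced with PyTorch Geometric roughly as follows. This is a minimal sketch, not the authors' released code; the dataset name (ENZYMES), the random seed, and the batch size are assumptions chosen for illustration.

```python
# Minimal sketch (assumed, not the authors' code): load one of the cited
# TUDatasets with PyTorch Geometric and reproduce the reported 90%/10%
# uniform random train/test split. Dataset name, seed, and batch size
# are illustrative assumptions.
import torch
from torch_geometric.datasets import TUDataset
from torch_geometric.loader import DataLoader

dataset = TUDataset(root="data/TUDataset", name="ENZYMES")

generator = torch.Generator().manual_seed(0)        # assumed seed
perm = torch.randperm(len(dataset), generator=generator)
n_train = int(0.9 * len(dataset))                   # 90% for training
train_dataset = dataset[perm[:n_train]]
test_dataset = dataset[perm[n_train:]]              # remaining 10% for testing

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)  # assumed batch size
test_loader = DataLoader(test_dataset, batch_size=32)
```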
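
Similarly, the quoted training setup (five message-passing layers, sum pooling, a two-layer MLP classifier, cross-entropy loss, ADAM with learning rate 0.001, 500 epochs, no dropout or learning-rate decay) could look roughly like the sketch below. The use of GIN-style convolutions and a hidden dimension of 64 are assumptions for illustration; the linked repository contains the authors' actual implementation.

```python
# Minimal sketch (assumed, not the authors' implementation): a GIN-style GNN
# with five message-passing layers, sum pooling, and a two-layer MLP head,
# trained with cross-entropy and ADAM (lr 0.001) for 500 epochs, with no
# dropout or learning-rate decay, as described in the quoted setup.
import torch
import torch.nn.functional as F
from torch.nn import Linear, ModuleList, ReLU, Sequential
from torch_geometric.nn import GINConv, global_add_pool

class GNN(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes, num_layers=5):
        super().__init__()
        self.convs = ModuleList()
        dims = [in_dim] + [hidden_dim] * num_layers
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            mlp = Sequential(Linear(d_in, d_out), ReLU(), Linear(d_out, d_out))
            self.convs.append(GINConv(mlp))
        # Two-layer MLP for the final graph-level classification.
        self.head = Sequential(Linear(hidden_dim, hidden_dim), ReLU(),
                               Linear(hidden_dim, num_classes))

    def forward(self, x, edge_index, batch):
        for conv in self.convs:
            x = conv(x, edge_index).relu()
        x = global_add_pool(x, batch)   # sum pooling over the nodes of each graph
        return self.head(x)

def train(model, loader, epochs=500, lr=0.001):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # no lr decay
    model.train()
    for _ in range(epochs):
        for data in loader:
            optimizer.zero_grad()
            out = model(data.x, data.edge_index, data.batch)
            loss = F.cross_entropy(out, data.y)
            loss.backward()
            optimizer.step()

# Example usage with the loaders from the previous sketch (hypothetical):
# model = GNN(dataset.num_features, hidden_dim=64, num_classes=dataset.num_classes)
# train(model, train_loader)
```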