Curvature Filtrations for Graph Generative Model Evaluation
Authors: Joshua Southern, Jeremy Wayland, Michael Bronstein, Bastian Rieck
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments reveal our metric is robust and expressive, thus improving upon current approaches that use simple graph descriptor and evaluator functions. |
| Researcher Affiliation | Academia | Joshua Southern (Imperial College London, jks17@ic.ac.uk); Jeremy Wayland (Helmholtz Munich & Technical University of Munich, jeremy.wayland@tum.de); Michael Bronstein (University of Oxford, michael.bronstein@cs.ox.ac.uk); Bastian Rieck (Helmholtz Munich & Technical University of Munich, bastian.rieck@tum.de) |
| Pseudocode | Yes | From Appendix A ("Pseudocode"): Here we give pseudocode for various parts of the method, highlighting the most relevant aspects of using curvature filtrations to evaluate graph generative models. First, we outline a crucial part of our evaluation framework in Algorithm 1: how we compute summary topological descriptors for sets of graphs. [...] Algorithm 2, on the other hand, outlines our procedure for generating a distance between two sets of graphs using their summary topological descriptors. (A minimal code sketch of this two-step pipeline appears after this table.) |
| Open Source Code | Yes | We also make our framework publicly available. Source code is available at https://github.com/aidos-lab/CFGGME. |
| Open Datasets | Yes | We evaluate discrete curvatures and their filtrations on the BREC data set, which was recently introduced to evaluate GNN expressiveness [66]. The data set consists of different categories of graph pairs (Basic, Regular, and Extension), which are distinguishable by 3-WL but not by 1-WL, as well as Strongly-Regular (STR) and CFI graph pairs, which are indistinguishable using 3-WL. [...] we generate four graphons... Sampling from these graphons produces dense graphs, and we control their size to be between 9 and 37 nodes, thus ensuring that we match the sizes of molecular graphs in the ZINC data set [32], an important application for generative models. [...] We randomly sample ten graphs from four different bioinformatics data sets, i.e. KKI, PROTEINS, Peking, and ENZYMES [51]. (A sketch of the graphon sampling step appears after this table.) |
| Dataset Splits | No | The paper mentions using specific datasets for its experiments and refers to a "pre-defined test split" for one of them, but it does not give explicit details on how the data were partitioned into training, validation, and test sets (e.g., percentages, sample counts, or a splitting methodology) across experiments, which would be needed for reproducibility. |
| Hardware Specification | No | The paper discusses computational complexity and provides tables with computation times for different graph sizes and numbers of graphs. However, it does not specify any particular hardware components such as GPU models (e.g., NVIDIA A100), CPU models (e.g., Intel Xeon), or cloud instance types used for these computations. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments. It describes the methodology and algorithms but without detailing the software environment required for reproduction. |
| Experiment Setup | No | The paper describes the overall experimental design and evaluation procedures, including comparisons to other methods and analyses on different datasets. However, it does not provide specific experimental-setup details such as hyperparameters (e.g., learning rate, batch size, or number of epochs for the 1-layer MLP mentioned in Section 4.4) or system-level training configurations, which are necessary for full reproducibility. |
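
For readers who want a concrete sense of the two-step pipeline that Algorithms 1 and 2 describe, the following is a minimal, self-contained Python sketch. It is not the authors' CFGGME implementation (see their repository for that): it uses augmented Forman-Ricci curvature as the edge filtration function, extracts 0-dimensional persistence pairs with a union-find, vectorises each diagram by its padded, sorted lifetimes, and compares two sets of graphs via the Euclidean distance between their average summary vectors. The function names, the `dim` padding parameter, and the choice of summary statistic are all illustrative assumptions.

```python
# Minimal sketch of a curvature-filtration distance between two sets of
# graphs; NOT the authors' CFGGME code, which offers several curvatures
# and topological distances.
import networkx as nx
import numpy as np

def forman_curvature(G, u, v):
    """Augmented Forman-Ricci curvature of edge (u, v):
    4 - deg(u) - deg(v) + 3 * (#triangles containing the edge)."""
    triangles = len(set(G[u]) & set(G[v]))
    return 4 - G.degree(u) - G.degree(v) + 3 * triangles

def zero_dim_persistence(G):
    """Birth-death pairs of connected components in the sublevel-set
    filtration induced by edge curvature (cf. Algorithm 1). A vertex is
    born at the smallest curvature among its incident edges; a component
    dies when an edge merges it into an older one (elder rule). The
    essential component of each connected graph yields no pair."""
    curv = {tuple(sorted(e)): forman_curvature(G, *e) for e in G.edges()}
    birth = {
        v: min(curv[tuple(sorted((v, w)))] for w in G[v]) if G.degree(v) else 0.0
        for v in G.nodes()
    }
    parent = {v: v for v in G.nodes()}

    def find(x):  # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for (u, v), value in sorted(curv.items(), key=lambda kv: kv[1]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        young, old = (ru, rv) if birth[ru] > birth[rv] else (rv, ru)
        pairs.append((birth[young], value))
        parent[young] = old
    return pairs

def set_distance(graphs_a, graphs_b, dim=16):
    """Stand-in for Algorithm 2: summarise each graph's diagram as a
    fixed-length vector of sorted lifetimes, average within each set,
    and return the Euclidean distance between the two set averages."""
    def summary(graphs):
        vecs = []
        for G in graphs:
            lifetimes = sorted((d - b for b, d in zero_dim_persistence(G)),
                               reverse=True)
            vecs.append((lifetimes + [0.0] * dim)[:dim])
        return np.mean(vecs, axis=0)

    return float(np.linalg.norm(summary(graphs_a) - summary(graphs_b)))

# Usage: sets drawn from different distributions should be far apart.
real = [nx.erdos_renyi_graph(20, 0.3, seed=s) for s in range(10)]
fake = [nx.erdos_renyi_graph(20, 0.6, seed=s) for s in range(10)]
print(set_distance(real, fake))
```

The paper compares sets of graphs through distances between topological descriptors such as persistence landscapes; the padded-lifetimes vector above is a deliberately simple substitute chosen to keep the sketch dependency-free.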
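The graphon experiment quoted in the Open Datasets row can likewise be sketched in a few lines. The graphon `W(x, y) = x * y` below is only an illustrative choice; the paper's four graphons are not reproduced here. The 9-to-37-node size range follows the quoted ZINC-matching setup.

```python
# Hypothetical graphon sampling sketch; the specific graphons used in
# the paper are assumptions here, only the node-size range is quoted.
import networkx as nx
import numpy as np

def sample_graphon(W, n, rng):
    """Sample an n-node graph from a graphon W: [0, 1]^2 -> [0, 1].
    Node i receives a latent position u_i ~ Uniform(0, 1); each edge
    (i, j) is included independently with probability W(u_i, u_j)."""
    u = rng.uniform(size=n)
    G = nx.empty_graph(n)
    for i in range(n):
        for j in range(i + 1, n):
            if rng.uniform() < W(u[i], u[j]):
                G.add_edge(i, j)
    return G

rng = np.random.default_rng(0)
W = lambda x, y: x * y  # illustrative graphon, not one of the paper's four
# 9 to 37 nodes, matching molecule sizes in ZINC (rng.integers excludes
# the upper bound, hence 38).
graphs = [sample_graphon(W, int(rng.integers(9, 38)), rng) for _ in range(10)]
```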