On the Role of Edge Dependency in Graph Generative Models
Authors: Sudhanshu Chanpuriya, Cameron N Musco, Konstantinos Sotiropoulos, Charalampos Tsourakakis
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation, conducted on real-world datasets, focuses on assessing the output quality and overlap of our proposed models in comparison to other popular models. |
| Researcher Affiliation | Collaboration | 1University of Illinois Urbana-Champaign, Urbana, USA 2University of Massachusetts Amherst, Amherst, USA 3Meta, Menlo Park, United States 4Boston University, Boston, USA. |
| Pseudocode | Yes | Algorithm 1 Sampling Gp based on max cliques |
| Open Source Code | No | Our implementation of the methods we introduce is written in Python and uses the NumPy (Harris et al., 2020) and SciPy (Virtanen et al., 2020) packages. Additionally, to calculate the various graph metrics, we use the following packages: MACE (MAximal Clique Enumerator) (Takeaki, 2012) and Pivoter (Jain & Seshadhri, 2020). We use the following implementation: https://github.com/kiarashza/graphvae-mm. We use the publicly available implementation by the authors of this method: https://github.com/tkipf/gae. The paper describes using other open-source code but does not provide access to the code for their own introduced models (MCEI, MCNI, MCAD). |
| Open Datasets | Yes | We use the following eight publicly available datasets (together with one synthetic dataset) that we describe next. Table 2 also provides a summary of them. CITESEER (Sen et al., 2008), CORA (Sen et al., 2008), PPI (Stark et al., 2010), POLBLOGS (Adamic & Glance, 2005), WEB-EDU (Gleich et al., 2004), LES MISERABLES (Knuth, 1993), WIKI-ELECT (Leskovec et al., 2010b), FACEBOOK-EGO (Leskovec & Mcauley, 2012), RING OF CLIQUES (synthetic dataset) |
| Dataset Splits | No | for the other methods, we use early stopping or vary the representation dimensionality. While early stopping is mentioned, there are no specific details on how datasets are split into training, validation, and test sets (e.g., percentages, methodology, or specific files). |
| Hardware Specification | No | No specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running experiments are mentioned in the paper. |
| Software Dependencies | No | Our implementation of the methods we introduce is written in Python and uses the NumPy (Harris et al., 2020) and SciPy (Virtanen et al., 2020) packages. Additionally, to calculate the various graph metrics, we use the following packages: MACE (MAximal Clique Enumerator) (Takeaki, 2012) and Pivoter (Jain & Seshadhri, 2020). While software is mentioned, specific version numbers for Python, NumPy, SciPy, MACE, or Pivoter are not provided. |
| Experiment Setup | Yes | MCEI, MCNI, MCAD: We range, respectively, the probability p of including an edge of a clique, a node of a clique, or a clique in the sampled graph. We typically use evenly spaced numbers for p over the interval between 0 and 1. For graphs with a larger number of maximal cliques (PolBlogs and PPI), we additionally take the square of these values to decrease the probability. CELL: We set the dimension of the embeddings at 32 for the larger datasets and 8 for the smaller ones. [...] we stop training at every iteration and sample 10 graphs every time the overlap between the generated graphs and the input graph exceeds some value (we set 10 equally spaced thresholds between 0.05 and 0.75). GraphVAE: We use 512 dimensions for the graph-level embedding that precedes the MLP and 32 dimensions for the hidden layers of the graph autoencoder. VGAE: [...] we increase the dimensions of the hidden layers from 4 to 1024 and train for 5,000 epochs. |
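The setup for MCEI, MCNI, and MCAD above can be sketched in code. The following is a hedged, minimal illustration in Python with NumPy (the packages the paper reports using): `probability_grid` builds the evenly spaced values of p over [0, 1], optionally augmented with their squares for graphs with many maximal cliques, and `sample_mcei` shows one plausible reading of MCEI-style sampling, in which each edge of each maximal clique is included independently with probability p. The function names, signatures, and the clique representation are assumptions for illustration, not the authors' released code.

```python
import numpy as np

def probability_grid(num_points=10, square_for_large=False):
    """Evenly spaced p values over [0, 1]; optionally add their squares,
    as described for graphs with many maximal cliques (e.g. PolBlogs, PPI)."""
    p = np.linspace(0.0, 1.0, num_points)
    if square_for_large:
        # Squaring shifts the sweep toward smaller inclusion probabilities.
        p = np.unique(np.concatenate([p, p ** 2]))
    return p

def sample_mcei(max_cliques, p, rng=None):
    """Hypothetical MCEI-style sampler: include each edge of each maximal
    clique independently with probability p. `max_cliques` is a list of
    node lists (e.g. as enumerated by a tool such as MACE)."""
    rng = np.random.default_rng(rng)
    edges = set()
    for clique in max_cliques:
        for i in range(len(clique)):
            for j in range(i + 1, len(clique)):
                u, v = sorted((clique[i], clique[j]))
                if rng.random() < p:
                    edges.add((u, v))
    return edges
```

For example, `sample_mcei([[0, 1, 2]], 1.0)` returns all three edges of the triangle, while p = 0 returns the empty set; intermediate p values interpolate between the empty graph and the union of the cliques.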