Modeling Transitivity and Cyclicity in Directed Graphs via Binary Code Box Embeddings

Authors: Dongxu Zhang, Michael Boratko, Cameron Musco, Andrew McCallum

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Theoretical and empirical results show that the proposed models not only preserve a useful inductive bias of transitivity but also have sufficient representational capacity to model arbitrary graphs, including graphs with cycles. We evaluate our model on graph reconstruction and link prediction tasks with various synthetic graphs and real-world graphs, and observe that our proposed methods perform the best in almost all scenarios, especially when a graph has strong cyclicity and transitivity."
Researcher Affiliation | Academia | "University of Massachusetts Amherst {dongxuzhang, mboratko, cmusco, mccallum}@cs.umass.edu"
Pseudocode | No | The paper includes mathematical formulations and proofs but no distinct pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is available at https://github.com/iesl/geometric_graph_embedding."
Open Datasets | Yes | "We evaluate on the following real-world graphs: Google hyperlink network, Epinions trust network, CORA citation network, Twitter network, and DREAM5 gene regulatory networks. For more data statistics, see Appendix E." Code and data for the real-world graphs were publicly available.
Dataset Splits | Yes | "During training, all hyperparameters are tuned via 10% cross-validation and the test set results are averaged over 10 models of different random seeds, which were trained on the full training set with the best hyperparameters found during cross-validation."
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments. The ethics checklist states: "3. If you ran experiments... (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No]"
Software Dependencies | No | The paper does not list specific versions for software dependencies (e.g., Python or PyTorch versions).
Experiment Setup | Yes | "We use Bayesian hyperparameter tuning based on the optimal F1 score for reconstruction. We sample minibatches of positive edges in T, and for each positive edge we sample 32 edges not in T by randomly corrupting either the source or target node. We also use a self-adversarial negative weight, as described in [26]. During training, all hyperparameters are tuned via 10% cross-validation. For fair comparison, we follow [30] and use 128 parameters per vertex for all our models. We also tune the weight of the sparsity regularization wr for all BC-Box models (see Section 4.5)."
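The training procedure quoted in the Experiment Setup row (corrupting one endpoint of each positive edge to draw 32 negatives, then weighting negatives by a self-adversarial softmax as in [26]) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the negative scores are assumed to come from whatever edge-scoring model is being trained.

```python
import numpy as np

def sample_negative_edges(pos_edge, num_nodes, k=32, rng=None):
    """Draw k negative edges by randomly corrupting either the source
    or the target node of one positive edge (as described in the paper)."""
    if rng is None:
        rng = np.random.default_rng()
    src, dst = pos_edge
    negs = []
    for _ in range(k):
        if rng.random() < 0.5:
            negs.append((int(rng.integers(num_nodes)), dst))  # corrupt source
        else:
            negs.append((src, int(rng.integers(num_nodes))))  # corrupt target
    return negs

def self_adversarial_weights(neg_scores, alpha=1.0):
    """Softmax weights over negative-sample scores with temperature alpha,
    following the self-adversarial scheme of [26]: harder negatives
    (higher model scores) receive larger loss weights."""
    z = alpha * np.asarray(neg_scores, dtype=float)
    z -= z.max()            # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()
```

In a training loop, the weights would multiply each negative edge's loss term, so the gradient focuses on negatives the current model scores highly; `alpha` controls how sharply the weighting concentrates on the hardest negatives.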