From Graphs to Hypergraphs: Hypergraph Projection and its Reconstruction

Authors: Yanbang Wang, Jon Kleinberg

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our reconstruction method is evaluated on 8 real-world datasets under different settings, and exhibits consistently good performance."
Researcher Affiliation | Academia | "Yanbang Wang, Jon Kleinberg, Department of Computer Science, Cornell University, {ywangdr,kleinberg}@cs.cornell.edu"
Pseudocode | Yes | "Algorithm 1 Optimize Clique Sampler" (a hedged sketch of a greedy sampler optimizer follows the table)
Open Source Code | Yes | "Our code and data are available here." "Our code and data can be downloaded from https://anonymous.4open.science/r/supervised_hypergraph_reconstruction-FD0B/README.md."
Open Datasets | Yes | "We use 8 real-world datasets from various application domains. ... All source links and data can be found in submitted code." (e.g., Enron (Benson et al., 2018a))
Dataset Splits | Yes | "To generate a training set and a query set, we follow two common standards to split the collection of hyperedges in each dataset: (1) For datasets that come in natural segments, such as DBLP and Enron whose hyperedges are timestamped, we follow their segments so that training and query contain two disjoint and roughly equal-sized sets of hyperedges. ... (2) For all the other datasets that lack natural segments, we randomly split the set of hyperedges into halves. Regarding the tuning of β, we found the best β by training our model on 90% of the training data and evaluating on the remaining 10%." (a split sketch follows the table)
Hardware Specification | Yes | "All experiments including model training are run on Intel Xeon Gold 6254 CPU @ 3.15GHz with 1.6TB Memory."
Software Dependencies | No | "I did not find specific version numbers for key software components or libraries. The paper mentions 'optimize using Adam' and 'Bayesian-MDL is written in C++, CMM in Matlab, and all other methods in Python', but lacks version details for these or other dependencies."
Experiment Setup | Yes | "For models requiring backpropagation, we use cross-entropy loss and optimize using Adam for 2000 epochs with learning rate 0.0001. For CMM, we set the number of latent factors to 30. For Hyper-SAGNN, we set the representation size to 64, window size to 10, walk length to 40, and the number of walks per vertex to 10." (a training-loop sketch follows the table)
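To make three of the quoted procedures concrete, hedged sketches follow. First, Algorithm 1 ("Optimize Clique Sampler"): a minimal greedy budget-allocation sketch, assuming the sampler spreads a total sample budget β across clique cells by marginal training-hyperedge coverage per sample drawn. The cell representation and the names `cells`, `coverage`, `cost`, and `budget` are illustrative assumptions, not the paper's actual interface.

```python
# Hedged sketch in the spirit of Algorithm 1 "Optimize Clique Sampler".
# Assumption: each cell groups candidate cliques, and the optimizer
# greedily buys the cell with the best coverage-per-sample ratio until
# the sampling budget (beta) is exhausted.

def optimize_clique_sampler(cells, coverage, cost, budget):
    """Greedily select cells until the budget runs out.

    cells    : iterable of hashable cell ids, e.g. (n, k) pairs
    coverage : dict cell -> training hyperedges the cell can recover
    cost     : dict cell -> number of clique samples the cell requires
    budget   : total number of clique samples allowed (beta)
    """
    chosen, spent = [], 0
    remaining = set(cells)
    while remaining:
        # Density: hyperedges recovered per sample drawn from the cell.
        best = max(remaining, key=lambda c: coverage[c] / max(cost[c], 1))
        if spent + cost[best] > budget:
            break
        chosen.append(best)
        spent += cost[best]
        remaining.remove(best)
    return chosen

# Example: three cells with (coverage, cost) and a budget of 10 samples.
cells = {"A": (5, 4), "B": (3, 2), "C": (4, 6)}
picked = optimize_clique_sampler(
    cells,
    {c: v[0] for c, v in cells.items()},
    {c: v[1] for c, v in cells.items()},
    budget=10)
print(picked)  # ['B', 'A']: C's cost of 6 would exceed the budget
```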
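Second, the train/query split. A minimal sketch, assuming timestamped datasets are cut at the midpoint of their time order and the others are shuffled into random halves, with 10% of the training half held out for tuning β. `split_hyperedges` and its arguments are hypothetical names, not taken from the released code.

```python
import random

def split_hyperedges(hyperedges, timestamps=None, seed=0):
    """Split hyperedges into (train, validation, query).

    Follows the two standards quoted above: cut along natural segments
    (approximated here as the midpoint of the timestamp order) when
    timestamps exist, otherwise a random 50/50 split; then hold out 10%
    of the training half for beta tuning.
    """
    rng = random.Random(seed)
    if timestamps is not None:
        order = sorted(range(len(hyperedges)), key=lambda i: timestamps[i])
    else:
        order = list(range(len(hyperedges)))
        rng.shuffle(order)
    mid = len(order) // 2
    train = [hyperedges[i] for i in order[:mid]]
    query = [hyperedges[i] for i in order[mid:]]
    cut = int(0.9 * len(train))  # 90% to fit, 10% to tune beta
    return train[:cut], train[cut:], query
```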
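Third, the quoted optimization setup. A minimal PyTorch sketch wiring together the stated hyperparameters (cross-entropy loss, Adam, 2000 epochs, learning rate 0.0001); the model and the (features, labels) tensors are placeholders, and the authors' actual training code may differ in structure.

```python
import torch
import torch.nn as nn

def train(model, features, labels, epochs=2000, lr=1e-4):
    """Training loop matching the quoted setup: cross-entropy loss,
    Adam, 2000 epochs, learning rate 0.0001."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        logits = model(features)        # model outputs class logits
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
    return model
```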