HypeBoy: Generative Self-Supervised Representation Learning on Hypergraphs

Authors: Sunwoo Kim, Shinhwan Kang, Fanchen Bu, Soo Yong Lee, Jaemin Yoo, Kijung Shin

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that HYPEBOY learns effective general-purpose hypergraph representations. It significantly outperforms 16 baseline methods across 11 benchmark datasets. Code and datasets are available at https://github.com/kswoo97/hypeboy." (Abstract) ... "We assess the generalizability of learned representations from HYPEBOY in two downstream tasks: node classification and hyperedge prediction. ... As shown in Table 1, HYPEBOY shows the best average ranking among all 18 methods." (Section 5.1)
Researcher Affiliation | Collaboration | Sunwoo Kim, Shinhwan Kang, Fanchen Bu, Soo Yong Lee, Jaemin Yoo, Kijung Shin; Kim Jaechul Graduate School of AI, School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST); {kswoo97, shinhwan.kang, boqvezen97, syleetolow, jaemin, kijungs}@kaist.ac.kr ... "This work was supported by Samsung Electronics Co., Ltd. and Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00871, Development of AI Autonomy and Knowledge Enhancement for AI Agent Collaboration) (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST))."
Pseudocode | Yes | "Algorithm 1: Node swapping algorithm" (a hedged sketch of such a node-swapping step appears after the table)
Open Source Code | Yes | "Code and datasets are available at https://github.com/kswoo97/hypeboy."
Open Datasets | Yes | "For experiments, we use 11 benchmark hypergraph datasets. ... The Cora, Citeseer, Pubmed, Cora-CA, and DBLP-P datasets are from the work of Yadati et al. ... The DBLP-A and IMDB datasets are from the work of Wang et al. (2019). ... The AMiner dataset is from the work of Zhang et al. (2019). ... The ModelNet-40 (MN-40) dataset is from the work of Wu et al. (2015). ... The 20Newsgroups (20News) dataset is from the work of Dua et al. (2017). ... The House dataset is from the work of Chien et al. (2022)."
Dataset Splits | Yes | "Following Wei et al. (2022), we randomly split the nodes into training/validation/test sets with the ratio of 1%/1%/98%, respectively. ... For hyperedge prediction, we split hyperedges into training/validation/test sets by the ratio of 60%/20%/20%." (see the split sketch after the table)
Hardware Specification | Yes | "All experiments are conducted on a machine with NVIDIA RTX 8000 D6 GPUs (48GB memory) and two Intel Xeon Silver 4214R processors."
Software Dependencies | No | The paper mentions several software components, such as the Adam optimizer, UniGCNII, GCN, and activation functions like ReLU, but it does not specify version numbers for these components or for the libraries that implement them.
Experiment Setup | Yes | "We fix the hidden dimension and dropout rate of all models as 128 and 0.5, respectively. When training any neural network for downstream tasks, we train a model for 200 epochs, and for every 10 epochs, we evaluate the validation accuracy of the model. ... For a linear evaluation protocol of node classification, we utilize a logistic classifier with a learning rate of 0.001. ... For all the supervised models, we tune the learning rate as a hyperparameter within {0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001}. ... For HYPEBOY, we tune the feature augmentation magnitude p_x within {0.0, 0.1, 0.2, 0.3, 0.4} and the hyperedge augmentation magnitude p_e within {0.5, 0.6, 0.7, 0.8, 0.9}. We fix the learning rate and training epochs of the feature reconstruction warm-up as 0.001 and 300, respectively." (see the configuration sketch after the table)
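
The Pseudocode row points to Algorithm 1, a node swapping algorithm. The paper's exact procedure is available in the linked repository; purely as an illustration, a minimal sketch of node swapping as commonly used to corrupt hyperedges is given below. The function name swap_node, the choice to swap a single node, and the rejection-sampling details are our assumptions, not the paper's specification.

```python
import random

def swap_node(hyperedge, num_nodes, rng=random):
    """Return a corrupted (negative) copy of a hyperedge: one member node
    is swapped for a node sampled from outside the hyperedge."""
    corrupted = set(hyperedge)
    corrupted.remove(rng.choice(tuple(corrupted)))  # drop one member at random
    while True:  # rejection-sample a replacement not already in the hyperedge
        candidate = rng.randrange(num_nodes)
        if candidate not in hyperedge:
            corrupted.add(candidate)
            return corrupted

# Example: corrupt one hyperedge of a toy 10-node hypergraph.
negative_hyperedge = swap_node({0, 2, 5}, num_nodes=10)
```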
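
For the Dataset Splits row, the reported ratios are straightforward to reproduce with a random permutation. The sketch below is a generic way to draw such splits; the helper name, the fixed seed, and the rounding behavior are assumptions, since the paper's own splitting code lives in the repository.

```python
import numpy as np

def random_split(n, ratios, seed=0):
    """Randomly partition indices 0..n-1 into three sets with the given ratios."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    cut1 = round(ratios[0] * n)
    cut2 = cut1 + round(ratios[1] * n)
    return perm[:cut1], perm[cut1:cut2], perm[cut2:]

# Node classification: 1% / 1% / 98% node split (following Wei et al., 2022).
num_nodes = 2708  # e.g., Cora
train_nodes, val_nodes, test_nodes = random_split(num_nodes, (0.01, 0.01, 0.98))

# Hyperedge prediction: 60% / 20% / 20% hyperedge split.
num_hyperedges = 1000  # illustrative placeholder, not a dataset statistic
train_e, val_e, test_e = random_split(num_hyperedges, (0.60, 0.20, 0.20))
```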
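
The Experiment Setup row fixes some hyperparameters and grid-searches others. The sketch below simply collects the reported values in one place and enumerates the HYPEBOY augmentation grid; the dictionary layout and the train_and_validate entry point are hypothetical, not part of the released code.

```python
from itertools import product

# Values reported in the paper (the naming is ours).
FIXED = {
    "hidden_dim": 128,         # hidden dimension of all models
    "dropout": 0.5,            # dropout rate of all models
    "downstream_epochs": 200,  # downstream training length
    "eval_every": 10,          # validation accuracy checked every 10 epochs
    "linear_probe_lr": 0.001,  # logistic classifier for linear evaluation
    "warmup_lr": 0.001,        # feature-reconstruction warm-up
    "warmup_epochs": 300,
}
SUPERVISED_LRS = [0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001]  # tuned per model
HYPEBOY_GRID = {
    "p_x": [0.0, 0.1, 0.2, 0.3, 0.4],  # feature augmentation magnitude
    "p_e": [0.5, 0.6, 0.7, 0.8, 0.9],  # hyperedge augmentation magnitude
}

for p_x, p_e in product(HYPEBOY_GRID["p_x"], HYPEBOY_GRID["p_e"]):
    config = {**FIXED, "p_x": p_x, "p_e": p_e}
    # train_and_validate(config)  # hypothetical training entry point
    print(config)
```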