Inductive and Unsupervised Representation Learning on Graph Structured Objects

Authors: Lichen Wang, Bo Zong, Qianqian Ma, Wei Cheng, Jingchao Ni, Wenchao Yu, Yanchi Liu, Dongjin Song, Haifeng Chen, Yun Fu

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluate the effectiveness of the SEED framework via classification and clustering tasks on public benchmark datasets. We observe that graph representations generated by SEED are able to effectively capture structural information, and maintain stable performance even when the node attributes are not available. Compared with competitive baseline methods, the proposed SEED framework could achieve up to 10% improvement in prediction accuracy.
Researcher Affiliation | Collaboration | Lichen Wang¹, Bo Zong², Qianqian Ma³, Wei Cheng², Jingchao Ni², Wenchao Yu², Yanchi Liu², Dongjin Song², Haifeng Chen², and Yun Fu¹. ¹Northeastern University, Boston, USA; ²NEC Laboratories America, Princeton, USA; ³Boston University, Boston, USA
Pseudocode | No | The paper describes methods in narrative and mathematical forms but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access to source code, nor does it state that the code for the described methodology is publicly available.
Open Datasets | Yes | We employ seven public benchmark datasets to evaluate the effectiveness of SEED: Deezer User-User Friendship Networks (Deezer) (Rozemberczki et al., 2018), Mutagenic Aromatic and Heteroaromatic Nitro Compounds (MUTAG) (Debnath et al., 1991), NCI1 (Wale et al., 2008), PROTEINS (Borgwardt et al., 2005), COLLAB (Leskovec et al., 2005), IMDB-BINARY, and IMDB-MULTI (Yanardag & Vishwanathan, 2015). (A dataset-access sketch follows the table.)
Dataset Splits | No | The paper does not provide the train/validation/test splits (e.g., percentages, sample counts, or an explicit splitting methodology) needed to reproduce the data partitioning. It mentions using the datasets for 'classification and clustering tasks' but gives neither split ratios nor a splitting method. (An illustrative split specification follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions various algorithms and methods (e.g., Adam, MMD, t-SNE, Deep Set) and cites their original papers, but it does not provide versioned software dependencies (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiments.
Experiment Setup | Yes | Walk length and sample number are two meta-parameters in the SEED framework. By adjusting these two meta-parameters, we can make a trade-off between effectiveness and computational efficiency. In the experiment, we empirically evaluate the impact of the two meta-parameters on the MUTAG dataset. In Table 2, each row denotes the performance with a different sampling number (from 25 to 800) while the walk length is fixed at 10. In Table 3, we adjust the walk length from 5 to 25 while the sampling number is fixed at 200. (A sketch of this sweep follows the table.)
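
For convenience, six of the seven benchmarks listed in the Open Datasets row are distributed through the TU graph-kernel collection. The sketch below assumes PyTorch Geometric's TUDataset wrapper, which is not something the paper specifies; the Deezer friendship networks are released separately by Rozemberczki et al. (2018).

```python
# Sketch only: fetching six of the seven benchmarks via PyTorch Geometric's
# TUDataset wrapper. This illustrates dataset access, not the authors' pipeline.
from torch_geometric.datasets import TUDataset

NAMES = ["MUTAG", "NCI1", "PROTEINS", "COLLAB", "IMDB-BINARY", "IMDB-MULTI"]

for name in NAMES:
    ds = TUDataset(root="data/TUDataset", name=name)  # downloads on first use
    print(f"{name}: {len(ds)} graphs, {ds.num_classes} classes")
```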
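Purely illustrative of the detail missing from the Dataset Splits row: an explicit, reproducible split specification for graph classification often takes the form of 10-fold stratified cross-validation. The snippet below is a sketch under that assumption, using scikit-learn; the paper does not state that SEED followed this (or any other) scheme.

```python
# Illustrative only: an explicit, reproducible split specification of the kind
# the paper omits. 10-fold stratified CV is a common protocol for these
# benchmarks; nothing in the paper confirms SEED used it.
import numpy as np
from sklearn.model_selection import StratifiedKFold

labels = np.array([0, 1] * 94)  # dummy stand-in for the 188 MUTAG graph labels
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(np.zeros((len(labels), 1)), labels)):
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test graphs")
```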
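To make the reported meta-parameter sweep concrete, here is a minimal sketch. The helper evaluate_seed is a hypothetical stand-in (no SEED code is released), and the intermediate grid values are assumptions; the paper gives only the ranges 25 to 800 (sampling number) and 5 to 25 (walk length).

```python
# Hypothetical sketch of the MUTAG meta-parameter study. evaluate_seed is an
# assumed placeholder (no code accompanies the paper), and the grid values
# below are assumptions; the text gives only the endpoints of each range.

def evaluate_seed(dataset: str, walk_length: int, num_samples: int) -> float:
    """Placeholder: train SEED with the given meta-parameters, return accuracy."""
    return 0.0  # stand-in result

# Table 2: vary the sampling number while the walk length is fixed at 10.
for num_samples in (25, 50, 100, 200, 400, 800):
    acc = evaluate_seed("MUTAG", walk_length=10, num_samples=num_samples)
    print(f"samples={num_samples}: accuracy={acc:.3f}")

# Table 3: vary the walk length while the sampling number is fixed at 200.
for walk_length in (5, 10, 15, 20, 25):
    acc = evaluate_seed("MUTAG", walk_length=walk_length, num_samples=200)
    print(f"walk_length={walk_length}: accuracy={acc:.3f}")
```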