Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Network Schema Preserving Heterogeneous Information Network Embedding
Authors: Jianan Zhao, Xiao Wang, Chuan Shi, Zekuan Liu, Yanfang Ye
IJCAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three real-world datasets demonstrate that our proposed model NSHE significantly outperforms the state-of-the-art methods. |
| Researcher Affiliation | Academia | ¹School of CS, Beijing University of Posts and Telecommunications, Beijing, China; ²Department of CDS, Case Western Reserve University, OH, USA |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and dataset is publicly available on Github¹. |
| Open Datasets | Yes | DBLP [Lu et al., 2019]: We extract a subset of DBLP which contains 9556 papers (P), 2000 authors (A), and 20 conferences (C). [...] IMDB [Wang et al., 2019]: We extract a subset of IMDB which contains 3676 movies (M), 4353 actors (A), and 1678 directors (D). [...] ACM [Wang et al., 2019]: We extract papers published in KDD, SIGMOD, SIGCOMM, MobiCOMM, and VLDB and divide them into three classes: database, wireless communication, and data mining. |
| Dataset Splits | No | The paper states "we train a logistic classifier with 80% of the labeled nodes and use the remaining data for testing." but does not explicitly mention a separate validation split or its proportion. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments. |
| Software Dependencies | No | The paper mentions using "Adam [Kingma and Ba, 2015] algorithm" but does not specify version numbers for any software libraries, frameworks, or programming languages used for implementation. |
| Experiment Setup | Yes | For our proposed model, the feature dimension in common space and the embedding dimension d is set as 128. The negative schema instance sample rate Ms in Section 3.2 is set as 4. We perform neighborhood aggregation via an one-layer GCN, i.e., L = 1, and use two-layer-MLPs for schema instance classification. |
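
The setup and split quoted in the table can be written down as a small configuration sketch. This is only an illustration under stated assumptions, not the authors' code: the names `NSHEConfig` and `split_labeled_nodes` are hypothetical, and the random seed is an assumption because the paper does not report one. The hyperparameter values and the 80% training fraction come from the quoted text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class NSHEConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    embedding_dim: int = 128          # common-space feature dimension and embedding dimension d
    neg_schema_sample_rate: int = 4   # negative schema instance sample rate M_s
    gcn_layers: int = 1               # one-layer GCN, L = 1
    mlp_layers: int = 2               # two-layer MLPs for schema instance classification


def split_labeled_nodes(node_ids, train_fraction=0.8, seed=0):
    """Shuffle labeled node ids and split them into train and test lists.

    The seed is an assumption; the paper does not state one. No validation
    split is produced, matching the "Dataset Splits: No" finding.
    """
    ids = list(node_ids)
    random.Random(seed).shuffle(ids)   # seeded shuffle so the split is repeatable
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


config = NSHEConfig()
# 100 placeholder node ids, used only to illustrate the 80/20 proportion.
train_ids, test_ids = split_labeled_nodes(range(100))
```

Fixing the seed makes the split repeatable across runs. The paper reports neither a seed nor a validation split, so a reproduction may not recover exactly the same train and test nodes.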