Semantic Proximity Search on Heterogeneous Graph by Proximity Embedding

Authors: Zemin Liu, Vincent W. Zheng, Zhou Zhao, Fanwei Zhu, Kevin Chen-Chuan Chang, Minghui Wu, Jing Ying

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our proximity embedding method on three real-world public data sets, and show it outperforms the state-of-the-art baselines.
Researcher Affiliation | Academia | 1 Zhejiang University, China; 2 Advanced Digital Sciences Center, Singapore; 3 Zhejiang University City College, China; 4 University of Illinois at Urbana-Champaign, USA
Pseudocode | Yes | Algorithm 1 ProxEmbed; Algorithm 2 GetProxEmbedding (a sketch of this pipeline follows the table)
Open Source Code | Yes | We release the code for proximity embedding: https://bitbucket.org/vwz/aaai2017-proxembed/
Open Datasets | Yes | We use three real-world public data sets in our evaluation. The LinkedIn data set (Li, Wang, and Chang 2014)... The Facebook data set (McAuley and Leskovec 2012)... The DBLP data set (Wang et al. 2010)...
Dataset Splits | No | The paper explicitly states a "20% for training and the rest 80% for testing" split, but it does not mention a distinct validation set or its proportion (a split sketch follows the table).
Hardware Specification | Yes | We run experiments on Linux machines with eight 2.27GHz Intel Xeon(R) CPUs and 32GB memory.
Software Dependencies | Yes | We use Theano (Team 2016) for LSTM implementation and Java jdk-1.8 for path sampling (a minimal Theano LSTM sketch follows the table).
Experiment Setup | Yes | In the LinkedIn data set, we set γ = 20, ℓ = 20 for both schoolmate and colleague. In the Facebook data set... we set γ = 40, ℓ = 80 for classmate and γ = 20, ℓ = 80 for family. In the DBLP data set... we set γ = 20, ℓ = 80 for advisor and γ = 20, ℓ = 40 for advisee. In all the data sets and all the semantic classes, we set by default α = 0.3, β = 0.5 and μ = 0.0001 (except in family, where μ = 0.001). We tune different values of d for different data sets. (These values are collected in a configuration sketch after the table.)