Semantic Proximity Search on Heterogeneous Graph by Proximity Embedding

Authors: Zemin Liu, Vincent W. Zheng, Zhou Zhao, Fanwei Zhu, Kevin Chen-Chuan Chang, Minghui Wu, Jing Ying

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our proximity embedding method on three real-world public data sets, and show it outperforms the state-of-the-art baselines.
Researcher Affiliation | Academia | 1 Zhejiang University, China; 2 Advanced Digital Sciences Center, Singapore; 3 Zhejiang University City College, China; 4 University of Illinois at Urbana-Champaign, USA
Pseudocode | Yes | Algorithm 1 ProxEmbed; Algorithm 2 GetProxEmbedding (a sketch of this pipeline follows the table)
Open Source Code | Yes | We release the code for proximity embedding: https://bitbucket.org/vwz/aaai2017-proxembed/
Open Datasets | Yes | We use three real-world public data sets in our evaluation. The LinkedIn data set (Li, Wang, and Chang 2014)... The Facebook data set (McAuley and Leskovec 2012)... The DBLP data set (Wang et al. 2010)...
Dataset Splits | No | The paper explicitly states a "20% for training and the rest 80% for testing" split, but it does not mention a distinct validation set or its proportion (a split sketch follows the table).
Hardware Specification | Yes | We run experiments on Linux machines with eight 2.27GHz Intel Xeon(R) CPUs and 32GB memory.
Software Dependencies | Yes | We use Theano (Team 2016) for LSTM implementation and Java jdk-1.8 for path sampling (a minimal Theano LSTM sketch follows the table).
Experiment Setup | Yes | In the LinkedIn data set, we set γ = 20, ℓ = 20 for both schoolmate and colleague. In the Facebook data set... we set γ = 40, ℓ = 80 for classmate and γ = 20, ℓ = 80 for family. In the DBLP data set... we set γ = 20, ℓ = 80 for advisor and γ = 20, ℓ = 40 for advisee. In all the data sets and all the semantic classes, we set by default α = 0.3, β = 0.5 and μ = 0.0001 (except in family, where μ = 0.001). We tune different values of d for different data sets. (These values are collected in a configuration sketch after the table.)