Semantic Proximity Search on Heterogeneous Graph by Proximity Embedding
Authors: Zemin Liu, Vincent W. Zheng, Zhou Zhao, Fanwei Zhu, Kevin Chen-Chuan Chang, Minghui Wu, Jing Ying
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our proximity embedding method on three real-world public data sets, and show it outperforms the state-of-the-art baselines. |
| Researcher Affiliation | Academia | 1 Zhejiang University, China; 2 Advanced Digital Sciences Center, Singapore; 3 Zhejiang University City College, China; 4 University of Illinois at Urbana-Champaign, USA |
| Pseudocode | Yes | Algorithm 1: ProxEmbed; Algorithm 2: GetProxEmbedding |
| Open Source Code | Yes | We release the code for proximity embedding (footnote 1: https://bitbucket.org/vwz/aaai2017-proxembed/). |
| Open Datasets | Yes | We use three real-world public data sets in our evaluation. The LinkedIn data set (Li, Wang, and Chang 2014)... The Facebook data set (McAuley and Leskovec 2012)... The DBLP data set (Wang et al. 2010)... |
| Dataset Splits | No | The paper explicitly states a "20% for training and the rest 80% for testing" split. However, it does not mention a distinct validation set or its proportion. |
| Hardware Specification | Yes | We run experiments on Linux machines with eight 2.27GHz Intel Xeon(R) CPUs and 32GB memory. |
| Software Dependencies | Yes | We use Theano (Team 2016) for LSTM implementation and Java jdk-1.8 for path sampling. |
| Experiment Setup | Yes | In the LinkedIn data set, we set γ = 20, ℓ = 20 for both schoolmate and colleague. In the Facebook data set... we set γ = 40, ℓ = 80 for classmate and γ = 20, ℓ = 80 for family. In the DBLP data set... we set γ = 20, ℓ = 40 for advisee and γ = 20, ℓ = 80 for advisor. In all the data sets and all the semantic classes, we set by default α = 0.3, β = 0.5 and μ = 0.0001 (except in family, μ = 0.001). We tune different d's for different data sets. |
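
For anyone re-running the experiments, the settings quoted in the Experiment Setup row can be gathered into a single lookup. The sketch below is illustrative only: the names `DEFAULTS`, `PER_CLASS`, and `settings_for`, and the ASCII key spellings (`gamma`, `ell`, `alpha`, `beta`, `mu`), are assumptions and do not come from the released code. The dimension d is left out because the row gives no per-data-set value for it.

```python
# Hypothetical container for the hyperparameters quoted in the table above.
# Key names are ASCII stand-ins for the paper's symbols (gamma, ell, alpha, beta, mu).

# Defaults stated for all data sets and semantic classes.
DEFAULTS = {"alpha": 0.3, "beta": 0.5, "mu": 0.0001}

# Per-class values; "family" also overrides mu, as stated in the setup.
PER_CLASS = {
    ("LinkedIn", "schoolmate"): {"gamma": 20, "ell": 20},
    ("LinkedIn", "colleague"): {"gamma": 20, "ell": 20},
    ("Facebook", "classmate"): {"gamma": 40, "ell": 80},
    ("Facebook", "family"): {"gamma": 20, "ell": 80, "mu": 0.001},
    ("DBLP", "advisor"): {"gamma": 20, "ell": 80},
    ("DBLP", "advisee"): {"gamma": 20, "ell": 40},
}


def settings_for(dataset, semantic_class):
    """Return the defaults merged with the class-specific overrides."""
    merged = dict(DEFAULTS)
    merged.update(PER_CLASS[(dataset, semantic_class)])
    return merged


if __name__ == "__main__":
    for key in PER_CLASS:
        print(key, settings_for(*key))
```

Keeping the defaults separate from the overrides makes the one exception (μ for family) visible at a glance instead of burying it among repeated values.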