Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Representation Learning for Measuring Entity Relatedness with Rich Information
Authors: Yu Zhao, Zhiyuan Liu, Maosong Sun
IJCAI 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on the task of judging pair-wise word similarity. Experiment result shows that our model outperforms both traditional entity relatedness algorithms and other representation learning models. |
| Researcher Affiliation | Academia | 1 Department of Computer Science and Technology, State Key Lab on Intelligent Technology and Systems, National Lab for Information Science and Technology, Tsinghua University, Beijing, China 2 Jiangsu Collaborative Innovation Center for Language Ability, Jiangsu Normal University, Xuzhou 221009 China |
| Pseudocode | No | The paper describes the algorithm steps using mathematical equations and textual explanations, but does not include a formally labeled pseudocode block or algorithm figure. |
| Open Source Code | No | The paper does not explicitly state that open-source code for the methodology is provided, nor does it include a link to a code repository. |
| Open Datasets | Yes | We select the word similarity dataset Words-240 [Xiang et al., 2014]. This dataset contains 240 pairs of Chinese words, each of which is labeled by 20 annotators, which ensures its reliability. [...] We use the same segmented Chinese Wikipedia corpus for all methods, which ensures fair comparison. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits for the Words-240 dataset. It states 'The result is evaluated against the human similarity ratings using Spearman's ρ correlation coefficient' for the whole dataset. |
| Hardware Specification | Yes | In the experiment, we use a computer with eight Intel Xeon 2.00GHz processors for parallel computing. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as programming languages, libraries, or frameworks used for implementation. |
| Experiment Setup | Yes | In the experiment, we set the dimensionality K of the vector space to be 200. In the experiment we use λ = 0.01 and γ = δ = 0.005. In the experiment we initialize η with 0.01 and linearly decrease it after each iteration. In the experiment we set t = 50 and m = 10. |
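
The evaluation quoted in the Dataset Splits row, scoring the whole dataset with Spearman's ρ against human similarity ratings, can be sketched as follows. This is a minimal illustration and not the authors' code, which the paper does not release. It assumes that model similarity is the cosine between two learned vectors (the quoted text does not say which similarity function is used), and the names `cosine`, `average_ranks`, `spearman_rho`, and `evaluate` are hypothetical. The toy vectors and ratings in the usage note are invented placeholders, not Words-240 data.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either vector is all zeros).

    Assumption: cosine is used here only for illustration.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with >= 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)


def evaluate(
    pairs: Sequence[Tuple[str, str, float]],
    embeddings: Dict[str, Sequence[float]],
) -> float:
    """Correlate model similarities with human ratings over all pairs."""
    model_scores = [cosine(embeddings[a], embeddings[b]) for a, b, _ in pairs]
    human_scores = [rating for _, _, rating in pairs]
    return spearman_rho(model_scores, human_scores)
```

A toy call such as `evaluate([("a", "b", 4.5), ("a", "c", 1.0), ("b", "c", 2.0)], {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0]})` scores every pair in one pass, with no train/validation/test split, which matches the whole-dataset evaluation described in the table. In the reported setup the vectors would have K = 200 dimensions rather than two.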