Improving Context and Category Matching for Entity Search
Authors: Yueguo Chen, Lexi Gao, Shuming Shi, Xiaoyong Du, Ji-Rong Wen
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the INEX 2009 entity ranking task show that the proposed approach achieves a significant improvement of the entity search performance (xinfAP from 0.27 to 0.39) over the existing solutions. |
| Researcher Affiliation | Collaboration | Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), MOE, China; School of Information, Renmin University of China; Microsoft Research Asia, China. {chenyueguo, gaolexi, duyong, jrwen}@ruc.edu.cn, shumings@microsoft.com |
| Pseudocode | No | The paper describes models and formulas but does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the methodology described. |
| Open Datasets | Yes | We adopt a public-available document collection in our experiments: Wikipedia INEX 2009 collection² (shorted as INEX09). ² http://www.mpi-inf.mpg.de/departments/d5/software/inex/ |
| Dataset Splits | No | The paper mentions using the INEX 2009 entity ranking task dataset but does not specify the training, validation, or test dataset splits needed for reproduction. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware used for conducting the experiments. |
| Software Dependencies | No | The paper mentions using the 'Wikipedia-Miner (Milne and Witten 2008)' toolkit but does not specify version numbers for it or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | For the parameter h (L uses it for retrieving top-h relevant documents of a topic), we set h = 300 by default for LRCM and SRCM because larger h can only improve the precision very slightly. For the parameter λ, we adjust it and plot the results of L, S, LC and SC in Figure 1 respectively. It can be seen that the precision is stable for a wide range of λ. As such, we simply set λ = 0.5 for all the other experiments. For the parameter k of LCR+SCR, we find that the best performance is achieved when k = 20 in our experiments. |
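
The settings quoted in the Experiment Setup row can be recorded in one place for anyone attempting a reproduction. The following is a minimal sketch only: the class name `ReportedSetup`, its field names, and the helper `relative_gain` are hypothetical and do not come from the paper, and the role of λ is defined by the paper's own formulas, which are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Hypothetical container for the parameter values quoted in the table."""

    # h: number of top relevant documents retrieved per topic (LRCM and SRCM).
    top_h_documents: int = 300
    # lambda: set to 0.5 for all other experiments; its exact role is
    # defined in the paper and is not assumed here.
    lam: float = 0.5
    # k: value reported as giving the best performance for LCR+SCR.
    k: int = 20


def relative_gain(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


setup = ReportedSetup()
# xinfAP values quoted in the Research Type row: 0.27 (existing) vs 0.39 (proposed).
gain = relative_gain(0.27, 0.39)
print(setup)
print(f"relative xinfAP gain: {gain:.1%}")
```

With the quoted xinfAP values, the relative gain works out to roughly 44%. Freezing the dataclass is a design choice that keeps the recorded defaults from being changed by accident during an experiment run.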