Automatic Construction and Evaluation of a Large Semantically Enriched Wikipedia

Authors: Alessandro Raganato, Claudio Delli Bovi, Roberto Navigli

IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated SEW by carrying out both an intrinsic (Section 6.1) and an extrinsic evaluation (Sections 6.2 and 6.3). In the former we compared our sense annotations against those discovered by 3W [Noraset et al., 2014], a Wikipedia-specific system designed to add automatically high-precision hyperlinks to Wikipedia pages; in the latter we used SEW as a training set for Entity Linking (Section 6.2) and we exploited our propagated hyperlinks to develop Wikipedia-based language-independent vector representations for semantic similarity (Section 6.3). In both experiments of Sections 6.2 and 6.3 we compared against a baseline given by the original [...]
Researcher Affiliation | Academia | Alessandro Raganato, Claudio Delli Bovi, and Roberto Navigli, Department of Computer Science, Sapienza University of Rome, {raganato,dellibovi,navigli}@di.uniroma1.it
Pseudocode | No | The paper describes its approach in prose but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper provides a link to the generated corpus (http://lcl.uniroma1.it/sew) but does not state that the source code for the methodology is openly available.
Open Datasets | Yes | We built SEW by applying the approach described in Sections 3 and 4 to the English Wikipedia dump of November 2014. We relied on BabelNet 3.0 as sense inventory, and exploited the Stanford CoreNLP pipeline for preprocessing.
Dataset Splits | No | The paper refers to a 'hand-labeled evaluation set of 2,000 randomly selected Wikipedia pages, described in [Noraset et al., 2014] and used for training, validating and testing 3W' for the intrinsic evaluation, but it does not explicitly describe train/validation/test splits for its own experimental setup, in which SEW itself is used as the training data.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | No | The paper mentions 'BabelNet 3.0' with a version number and the 'Stanford CoreNLP pipeline' without one. For reproducibility, all key software components would need to be listed with their versions.
Experiment Setup | Yes | Given a category c, we first harvest all hyperlinks appearing in all Wikipedia pages in c at least twice, and then we rank them by frequency count. In order to filter out categories that are too broad or uninformative (e.g. Living people) we associate with each category c a probability distribution over hyperlinks f_c, and compute the entropy H(c) of such distribution as: [...] Given a Wikipedia page p, we consider each category c_p of p where H(c_p) is below a predefined threshold H (we used H = 0.5 in our experiments, Section 6), and add to S_p all the synsets that identify hyperlinks in S_{c_p}.
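The Experiment Setup row above describes a self-contained procedure: count hyperlinks per category, keep only categories whose hyperlink distribution has low entropy, and propagate the corresponding synsets to each page. Below is a minimal Python sketch of that step, assuming hyperlinks have already been extracted per page and mapped to BabelNet synset IDs; the data structures (pages_in_category, page_hyperlinks, page_categories) and the natural-log entropy are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from math import log

# The paper reports using H = 0.5 as the entropy threshold (footnote 5).
H_THRESHOLD = 0.5

def category_hyperlink_counts(category, pages_in_category, page_hyperlinks):
    """Harvest hyperlinks appearing at least twice across the pages of a category."""
    counts = Counter()
    for page in pages_in_category.get(category, []):
        counts.update(page_hyperlinks.get(page, []))
    return Counter({h: c for h, c in counts.items() if c >= 2})

def category_entropy(counts):
    """Shannon entropy H(c) of the hyperlink distribution f_c (log base assumed natural)."""
    total = sum(counts.values())
    if total == 0:
        return float("inf")  # nothing to propagate; treat empty categories as uninformative
    return -sum((c / total) * log(c / total) for c in counts.values())

def propagate_synsets(page, page_categories, pages_in_category, page_hyperlinks):
    """Add to S_p the synsets of hyperlinks harvested from low-entropy categories of p."""
    synsets = set(page_hyperlinks.get(page, []))
    for category in page_categories.get(page, []):
        counts = category_hyperlink_counts(category, pages_in_category, page_hyperlinks)
        if category_entropy(counts) < H_THRESHOLD:
            synsets.update(counts)  # union with the synsets identifying hyperlinks in S_{c_p}
    return synsets
```

This covers only the category-driven propagation step quoted above; the surrounding pipeline (hyperlink extraction, sense-inventory mapping and the other enrichment heuristics) is described in Sections 3 and 4 of the paper.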