Bootstrapping Domain Ontologies from Wikipedia: A Uniform Approach

Authors: Daniil Mirylenka, Andrea Passerini, Luciano Serafini

IJCAI 2015

Reproducibility variables, results, and supporting LLM responses:
Research Type: Experimental. "In this paper we propose an automatic method based on machine learning techniques for extracting domain ontology skeletons from the Wikipedia category hierarchy. We evaluate our method by generating ontology skeletons for the domains of Computing and Music. The quality of the generated ontologies has been measured against manually built ground truth datasets of several hundred nodes."
Researcher Affiliation: Academia. Daniil Mirylenka and Andrea Passerini, University of Trento, Via Sommarive 9, 38123 Trento, Italy ({dmirylenka, passerini}@disi.unitn.it); Luciano Serafini, Fondazione Bruno Kessler, Via Sommarive 18, 38123 Trento, Italy (serafini@fbk.eu).
Pseudocode: Yes. The paper includes Algorithm 1, "Selection of the relevant categories."
Open Source Code: Yes. "The code, data, and experiments are available online." (https://github.com/anonymous-ijcai/dsw-ont-ijcai)
Open Datasets: Yes. "The code, data, and experiments are available online." (https://github.com/anonymous-ijcai/dsw-ont-ijcai)
Dataset Splits: Yes. "The performance of the tasks was evaluated in cross-validation, and the parameters of the classifiers were tuned with a nested cross-validation on the training parts of each fold."
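The evaluation protocol quoted above (hyperparameters tuned by an inner cross-validation run only on the training part of each outer fold) can be sketched in plain Python. The fold counts, the `toy` parameter grid, and the `train_and_eval` callback below are illustrative assumptions, not details taken from the paper:

```python
import random

def k_fold_indices(n, k, seed=0):
    """Split range(n) into k roughly equal, shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nested_cv(X, y, train_and_eval, param_grid, k_outer=5, k_inner=3):
    """Nested cross-validation: pick hyperparameters on the inner folds,
    then score the chosen setting on the held-out outer fold."""
    outer = k_fold_indices(len(X), k_outer)
    scores = []
    for i, test_idx in enumerate(outer):
        train_idx = [j for fold in outer[:i] + outer[i + 1:] for j in fold]
        # Inner CV: tune using only the training part of this outer fold.
        best_param, best_score = None, float("-inf")
        for p in param_grid:
            inner = k_fold_indices(len(train_idx), k_inner)
            inner_scores = []
            for ii, val_pos in enumerate(inner):
                tr_pos = [j for fold in inner[:ii] + inner[ii + 1:] for j in fold]
                tr = [train_idx[j] for j in tr_pos]
                va = [train_idx[j] for j in val_pos]
                inner_scores.append(train_and_eval(X, y, tr, va, p))
            avg = sum(inner_scores) / len(inner_scores)
            if avg > best_score:
                best_param, best_score = p, avg
        # Score the tuned setting on the untouched outer test fold.
        scores.append(train_and_eval(X, y, train_idx, test_idx, best_param))
    return sum(scores) / len(scores)
```

Here `train_and_eval(X, y, train, val, param)` stands in for whatever classifier is being tuned; the key property of the protocol is that the outer test indices never influence the choice of `param`.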
Hardware Specification: No. The paper does not provide details of the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies: No. The paper mentions using a "standard L2-regularized linear SVM" and the NLTK package, but does not specify version numbers for these components.
Experiment Setup: No. The paper mentions "max depth" as a parameter of its selection algorithm and notes that the "parameters of the classifiers were tuned with a nested cross-validation", but it does not report specific hyperparameter values (e.g., the SVM regularization strength) or other training configuration details needed to reproduce the experiments.