Ontology-Based Information Extraction with a Cognitive Agent
Authors: Peter Lindes, Deryle Lonsdale, David Embley
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | After presenting the results from several evaluations that we have carried out, we summarize possible future directions. ... We discuss performance of the system and evaluate it against a gold-standard corpus of human annotations. ... So far we have run several evaluations of OntoSoar's performance, across different documents and while using different user ontologies. ... Table 1 shows combined precision, recall, and F-measure results for Samples 1 and 2 when compared to human annotations. ... The results are summarized in Table 2. |
| Researcher Affiliation | Academia | Peter Lindes, Deryle W. Lonsdale, David W. Embley Brigham Young University Provo, Utah 84602 |
| Pseudocode | No | The paper describes its system and processes through textual explanations and a system architecture diagram (Figure 3), but it does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions that "The LG parser is an open-source parsing tool", but this refers to a third-party tool the authors use, not to the release of their own OntoSoar system. There is no statement or link indicating that the OntoSoar code is open-source or publicly available. |
| Open Datasets | No | The paper mentions evaluating on "a gold-standard corpus of human annotations" and using text from "family history books", including "Vanderpoel 1902", and having "access to a private repository of over a hundred thousand such books". However, it provides no concrete access information (link, DOI, specific repository, or formal citation with author/year) for a publicly available dataset used for training or evaluation. |
| Dataset Splits | No | The paper evaluates against a "gold-standard corpus of human annotations" and presents results for "Samples 1 and 2" and "additional texts" (12 books). While it describes how these texts were selected, it does not specify any formal training, validation, or test splits (e.g., percentages, sample counts per split, or a cross-validation setup). |
| Hardware Specification | No | The paper states: "Processing took about 8 hours on a typical PC desktop." This description is too vague and lacks specific hardware details (e.g., CPU or GPU models, memory specifications) needed for reproducibility. |
| Software Dependencies | No | The paper states: "Onto Soar is built using Java components, some Java libraries, some custom Java components, the LG parser, the Soar system, and Soar code that implements all the semantic components." It names key systems such as the LG parser and Soar, but it gives no version numbers for Java, the LG parser, Soar, or any other critical libraries or dependencies, which would be necessary for reproducibility. |
| Experiment Setup | No | The paper describes the system's architecture and linguistic processing steps (e.g., tokenization, parsing, semantic analysis, ontology matching) in detail. However, it does not report specific numerical setup details such as hyperparameters (e.g., learning rates, batch sizes, number of epochs) or concrete training configurations for its models or components. |