MAPS-KB: A Million-Scale Probabilistic Simile Knowledge Base

Authors: Qianyu He, Xintao Wang, Jiaqing Liang, Yanghua Xiao

AAAI 2023

Reproducibility Assessment (variable, result, LLM response)
Research Type: Experimental. "We conduct sufficient experiments to justify the effectiveness of the methods in our framework. We also apply MAPS-KB to three downstream tasks, achieving state-of-the-art performance and further demonstrating the value of MAPS-KB."
Researcher Affiliation: Collaboration. (1) Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University; (2) School of Data Science, Fudan University; (3) Fudan-Aishu Cognitive Intelligence Joint Research Center, Shanghai, China.
Pseudocode: Yes. Algorithm 1: Component Extraction Based on Constituency Parsing.
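The paper's Algorithm 1 extracts simile components (topic, vehicle) from parse trees; the full constituency-parsing procedure is not reproduced here. As a rough illustration only, the sketch below substitutes a surface regular expression over the "X is like Y" pattern for the parse-tree traversal; all names and the pattern itself are assumptions, not the paper's method.

```python
import re

# Hypothetical, simplified stand-in for Algorithm 1: the paper uses
# constituency parsing, whereas this regex only handles the surface
# "<topic> is/are like <vehicle>" pattern.
LIKE_PATTERN = re.compile(
    r"^(?P<topic>.+?)\s+(?:is|are|was|were)\s+like\s+(?P<vehicle>.+?)[.!?]?$",
    re.IGNORECASE,
)

def extract_components(sentence: str):
    """Return (topic, vehicle) if the sentence matches the simile pattern, else None."""
    m = LIKE_PATTERN.match(sentence.strip())
    if not m:
        return None
    return m.group("topic"), m.group("vehicle")
```

A parse-tree-based extractor would instead locate the prepositional phrase headed by "like" and take its sibling noun phrase as the topic, which the surface pattern only approximates.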
Open Source Code: Yes. Resources of MAPS-KB are publicly available at https://github.com/Abbey4799/MAPS-KB.
Open Datasets: Yes. "We start by collecting sentences from several corpora (Table 1):"

Name         Size   # Pattern_like   # Pattern_be
OpenWebText  38GB   1.5M             21M
Gutenberg    26GB   0.5M             8M
BookCorpus   6GB    0.5M             2M
Overall      70GB   2M               31M

The inference rules are evaluated on three benchmark datasets: Linguistics (Roncero and de Almeida 2015), Quizzes, and General Corpus (He et al. 2022).
Dataset Splits: Yes. "The sentences are split into 8:1:1 as the train/dev/test splits."
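The reported 8:1:1 train/dev/test split can be sketched as follows; the function name and the fixed seed are assumptions, since the paper does not describe its splitting procedure beyond the ratio.

```python
import random

def split_811(items, seed=42):
    """Shuffle a list and split it 8:1:1 into train/dev/test partitions."""
    rng = random.Random(seed)  # fixed seed (assumed) for reproducibility
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train, n_dev = int(n * 0.8), int(n * 0.1)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test
```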
Hardware Specification: No. The paper does not specify the hardware (e.g., GPU models, CPU types, memory) used for the experiments. It mentions PLMs such as BERT-base and RoBERTa-large, which typically run on GPUs, but lists no specific devices.
Software Dependencies: No. The paper mentions software components such as BERT-base, RoBERTa-large, COMET, BART, and NLTK, but it provides no version numbers for these components or for the programming language used (e.g., the Python version).
Experiment Setup: No. The paper describes parts of the setup: BERT-base and RoBERTa-large models, confidence-score thresholds (θ_like, θ_be, θ_knowledge, θ_context), keeping the top-10 predictions, and a hyperparameter γ controlling vehicle length. However, the main text gives no numerical values for these thresholds or for γ, nor standard hyperparameters such as learning rate, batch size, or number of epochs.
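A minimal sketch of how such confidence thresholds would be applied when filtering extracted simile triples; the threshold values and field names below are illustrative assumptions, since the paper reports no concrete numbers.

```python
# Hypothetical filtering step: keep a triple only if its confidence score
# clears the threshold for its pattern type. These values are placeholders,
# not values from the paper (which does not state them).
THRESHOLDS = {"like": 0.5, "be": 0.6}

def filter_triples(triples):
    """triples: list of dicts with keys 'topic', 'vehicle', 'pattern', 'score'."""
    return [t for t in triples
            if t["score"] >= THRESHOLDS.get(t["pattern"], 1.0)]
```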