MAPS-KB: A Million-Scale Probabilistic Simile Knowledge Base
Authors: Qianyu He, Xintao Wang, Jiaqing Liang, Yanghua Xiao
Venue: AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct sufficient experiments to justify the effectiveness of methods of our framework. We also apply MAPS-KB on three downstream tasks to achieve state-of-the-art performance, further demonstrating the value of MAPS-KB. |
| Researcher Affiliation | Collaboration | 1Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University 2School of Data Science, Fudan University 3Fudan-Aishu Cognitive Intelligence Joint Research Center, Shanghai, China |
| Pseudocode | Yes | Algorithm 1: Component Extraction Based on Constituency Parsing (an illustrative re-implementation sketch follows the table). |
| Open Source Code | Yes | Resources of MAPS-KB are publicly available at https://github.com/Abbey4799/MAPS-KB. |
| Open Datasets | Yes | We start by collecting sentences from several corpora (Table 1): OpenWebText (38GB; 1.5M Pattern_like, 21M Pattern_be matches), Gutenberg (26GB; 0.5M, 8M), BookCorpus (6GB; 0.5M, 2M); overall 70GB with 2M Pattern_like and 31M Pattern_be matches. We evaluate our inference rules on three benchmark datasets: Linguistics (Roncero and de Almeida 2015), Quizzes and General Corpus (He et al. 2022). A toy pattern-counting sketch follows the table. |
| Dataset Splits | Yes | The sentences are split 8:1:1 into train/dev/test sets (a minimal splitting sketch follows the table). |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, memory) used for running the experiments. It mentions using PLMs such as BERT-base and RoBERTa-large, which typically run on GPUs, but no specific GPU models are listed. |
| Software Dependencies | No | The paper mentions software components such as BERT-base, RoBERTa-large, COMET, BART, and NLTK. However, it does not provide specific version numbers for any of these components or for the programming language used (e.g., Python version). |
| Experiment Setup | No | The paper mentions aspects of the setup such as using BERT-base and RoBERTa-large models, applying confidence score thresholds (θ_like, θ_be, θ_knowledge, θ_context), keeping top-10 predictions, and using a hyperparameter γ for vehicle length. However, specific numerical values for these thresholds, for γ, and for other typical hyperparameters such as learning rate, batch size, or number of epochs are not provided in the main text. |
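
The Pseudocode row points to Algorithm 1 (Component Extraction Based on Constituency Parsing). The paper's algorithm is not reproduced in this report, so the sketch below is only an illustration of the general technique: it uses NLTK (named under Software Dependencies) to pull the nearest noun phrases on either side of a "like" comparator from a hand-written parse tree. The tree, the helper names (`np_spans`, `extract_components`), and the traversal rules are assumptions, not MAPS-KB's algorithm.

```python
# Illustrative sketch only -- NOT the paper's Algorithm 1. The tree,
# helper names, and traversal rules here are assumptions.
from nltk import Tree

def np_spans(tree: Tree):
    """Return (start, end, text) leaf spans for every NP subtree."""
    spans = []
    pos = 0

    def walk(t: Tree):
        nonlocal pos
        start = pos
        for child in t:
            if isinstance(child, Tree):
                walk(child)
            else:
                pos += 1  # a leaf token
        if t.label() == "NP":
            spans.append((start, pos, " ".join(t.leaves())))

    walk(tree)
    return spans

def extract_components(tree: Tree):
    """Return the (topic, vehicle) NPs nearest to a 'like' comparator."""
    leaves = tree.leaves()
    if "like" not in leaves:
        return None
    k = leaves.index("like")
    before = [s for s in np_spans(tree) if s[1] <= k]  # NPs ending before "like"
    after = [s for s in np_spans(tree) if s[0] > k]    # NPs starting after "like"
    if not before or not after:
        return None
    return before[-1][2], after[0][2]

# Toy, hand-written bracketing for "her smile is like sunshine".
t = Tree.fromstring(
    "(S (NP (PRP$ her) (NN smile))"
    " (VP (VBZ is) (PP (IN like) (NP (NN sunshine)))))"
)
print(extract_components(t))  # ('her smile', 'sunshine')
```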
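The corpus statistics in the Open Datasets row count sentences matched by two surface patterns (Pattern_like and Pattern_be). This report does not give the paper's actual pattern definitions, so the regular expressions below are placeholder guesses, included purely to show the counting mechanics.

```python
import re
from collections import Counter

# Placeholder regexes -- the paper's actual Pattern_like / Pattern_be
# definitions are not given in this report.
PATTERNS = {
    "pattern_like": re.compile(r"\blike\s+(?:a|an|the)\s+\w+", re.IGNORECASE),
    "pattern_be": re.compile(r"\b(?:is|are|was|were)\s+(?:a|an|the)\s+\w+", re.IGNORECASE),
}

def count_pattern_hits(sentences):
    """Count how many sentences contain a match for each pattern."""
    counts = Counter()
    for sent in sentences:
        for name, pattern in PATTERNS.items():
            if pattern.search(sent):
                counts[name] += 1
    return counts

demo = [
    "Her voice was like a bell.",
    "He is a lion on the field.",
    "They walked home quietly.",
]
print(count_pattern_hits(demo))  # Counter({'pattern_like': 1, 'pattern_be': 1})
```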
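The Dataset Splits row reports an 8:1:1 train/dev/test split. A minimal, deterministic way to produce such a split is sketched below; the fixed seed and the shuffle-then-cut strategy are assumptions, since the paper's splitting procedure is not detailed in this report.

```python
import random

def split_811(items, seed=42):
    """Shuffle once, then cut into 80% train / 10% dev / 10% test."""
    rng = random.Random(seed)  # fixed seed is an assumption, not from the paper
    items = list(items)
    rng.shuffle(items)
    n_train = int(0.8 * len(items))
    n_dev = int(0.1 * len(items))
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]
    return train, dev, test

train, dev, test = split_811(range(1000))
print(len(train), len(dev), len(test))  # 800 100 100
```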