Implanting Rational Knowledge into Distributed Representation at Morpheme Level

Authors: Zi Lin, Yang Liu (pp. 2954-2961)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For evaluation, we validate the paradigmatic and syntagmatic relations of morpheme embeddings, and apply the obtained embeddings to word similarity measurement, achieving significant improvements over the classical models by more than 5 Spearman scores or 8 percentage points, which shows very promising prospects for adoption of the new source of knowledge. The experimental results of the 5 models are shown in Table 10.
Researcher Affiliation | Academia | Zi Lin (1,3) and Yang Liu (2,3); affiliations: 1 Department of Chinese Language and Literature, Peking University; 2 Institute of Computational Linguistics, Peking University; 3 Key Laboratory of Computational Linguistics (Ministry of Education), Peking University. Contact: {zi.lin, liuyang}@pku.edu.cn
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper states: "The data of morpheme embeddings and word similarity measurement is available at https://github.com/zi-lin/MC for research purpose." This statement refers to the data, not explicitly to the source code of the methodology, so it does not meet the criteria for an unambiguous release of the methodology's source code.
Open Datasets | Yes | wordsim-296 (Jin and Wu 2012) and PKU-500 (Wu and Li 2016) are used as evaluation datasets.
Dataset Splits | No | The paper mentions training data for morpheme embeddings and the test sets used for evaluation (wordsim-296 and PKU-500), but does not specify validation splits or any other partitioning details needed for reproduction.
Hardware Specification | No | No specific hardware (such as GPU/CPU models or cloud instance types) used for the experiments is mentioned in the paper.
Software Dependencies | No | The paper mentions using word2vec and CBOW but provides no version numbers for these or any other software dependencies, which reproducibility requires.
Experiment Setup | Yes | "For morpheme embeddings on these 54,880,628 pseudo-sentences, we set the dimension to 20 and context window size to 3 to include all the rational knowledge when the MC is the target word." and "In the experiments, the dimension is set to 50, and the context window size is set to 5." and "Eventually, 9 types of word-formation pattern in the test sets (see description below) are assigned with different weights for the morphemes, as shown in Table 9."
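The reported setup (dimension-20 morpheme embeddings trained with a context window of 3 on pseudo-sentences) can be sketched as follows. This is a minimal NumPy illustration of CBOW-style training, not the authors' code; the tiny pseudo-sentences and morpheme names below are hypothetical stand-ins for the paper's 54,880,628 pseudo-sentences.

```python
import numpy as np

# Hypothetical toy pseudo-sentences: an MC (morpheme) as target word
# surrounded by items drawn from its rational knowledge.
rng = np.random.default_rng(0)
pseudo_sentences = [
    ["MC_light", "bright", "lamp", "sun"],
    ["MC_light", "weight", "feather", "thin"],
]
vocab = sorted({m for s in pseudo_sentences for m in s})
idx = {m: i for i, m in enumerate(vocab)}
V, D, WINDOW, LR = len(vocab), 20, 3, 0.05  # dimension 20, window 3 (as reported)

W_in = rng.normal(scale=0.1, size=(V, D))   # input (context) embeddings
W_out = rng.normal(scale=0.1, size=(V, D))  # output (target) embeddings

for _ in range(50):                          # a few epochs over the toy corpus
    for sent in pseudo_sentences:
        for pos, target in enumerate(sent):
            ctx = [idx[w] for j, w in enumerate(sent)
                   if j != pos and abs(j - pos) <= WINDOW]
            if not ctx:
                continue
            h = W_in[ctx].mean(axis=0)       # average the context vectors
            scores = W_out @ h
            p = np.exp(scores - scores.max())
            p /= p.sum()                     # softmax over the vocabulary
            p[idx[target]] -= 1.0            # gradient of -log p[target] w.r.t. scores
            grad_h = W_out.T @ p
            W_out -= LR * np.outer(p, h)
            W_in[ctx] -= LR * grad_h / len(ctx)

morpheme_vec = W_in[idx["MC_light"]]         # a 20-dimensional morpheme embedding
print(morpheme_vec.shape)                    # → (20,)
```

Full softmax is used here only because the toy vocabulary is tiny; at the paper's scale a word2vec implementation with hierarchical softmax or negative sampling would be the practical choice.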
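The evaluation summarized above scores word-similarity predictions with Spearman's rank correlation against human ratings. A self-contained sketch of that metric follows; the ratings are hypothetical toy values, not data from wordsim-296 or PKU-500, and ties are not handled.

```python
def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks (no tie correction)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=vals.__getitem__)
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Perfectly concordant rankings give rho = 1.0:
human = [0.9, 0.7, 0.4, 0.1]     # hypothetical human similarity ratings
model = [0.8, 0.6, 0.3, 0.05]    # hypothetical model cosine similarities
print(spearman(human, model))    # → 1.0
```

Because only the ranks matter, the metric rewards models that order word pairs the way humans do, regardless of the absolute similarity values.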