Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Semantical Clustering of Morphologically Related Chinese Words

Authors: Chia-Ling Lee, Ya-Ning Chang, Chao-Lin Liu, Chia-Ying Lee, Jane Yung-jen Hsu

AAAI 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Experiment 1, we employed linguistic features at the word, syntactic, semantic, and contextual levels in aggregated computational linguistics methods to handle the clustering task. In Experiment 2, we recruited adults and children to perform the clustering task. Experimental results indicate that our computational model achieved a similar level of performance as children.
Researcher Affiliation	Academia	(1) Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan; (2) Institute of Linguistics, Academia Sinica, Taipei, Taiwan; (3) Department of Computer Science, National Chengchi University, Taipei, Taiwan
Pseudocode	No	The paper describes methods in text but does not include any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper provides a personal/project URL (http://www.csie.ntu.edu.tw/~r00922072/aaai14stu.html) but does not explicitly state that source code for the methodology is available there, nor is it a direct link to a code repository.
Open Datasets	Yes	We used the Academia Sinica Balanced Corpus [3] as the reference corpus. [3] http://rocling.iis.sinica.edu.tw/CKIP/engversion/20corpus.htm
Dataset Splits	No	The paper refers to 'Our test data and ground truth' and '11 morphological families, including 285 target words' but does not specify explicit training, validation, or test dataset splits in terms of percentages, counts, or a standard split reference.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for running experiments.
Software Dependencies	No	The paper mentions using 'Stanford Parser' and 'WordNet' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup	Yes	Based on heuristic, the weight of each method were determined based on its rank of individual performance (e.g., 1.0, 1.2, 1.3, 1.4)... To compute the similarity between two clusters, the average link method was adopted... F-NMI is defined as α·F1 + (1 − α)·NMI, where α is set to 0.5 in the current experiments.
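
The experiment setup quoted in the last row can be illustrated with a minimal Python sketch. This is not the authors' implementation. All function names are hypothetical, the weighted-average rule for combining per-method scores is an assumption (the excerpt gives only example weights), and F1 and NMI are taken as precomputed inputs because the excerpt does not say how they were computed.

```python
from itertools import product
from typing import Callable, Sequence


def f_nmi(f1: float, nmi: float, alpha: float = 0.5) -> float:
    """Combine F1 and NMI as alpha * F1 + (1 - alpha) * NMI.

    alpha = 0.5 matches the value quoted in the table above.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * f1 + (1.0 - alpha) * nmi


def average_link(
    cluster_a: Sequence[str],
    cluster_b: Sequence[str],
    similarity: Callable[[str, str], float],
) -> float:
    """Average-link similarity: mean pairwise similarity across two clusters."""
    pairs = list(product(cluster_a, cluster_b))
    if not pairs:
        raise ValueError("both clusters must be non-empty")
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)


def weighted_similarity(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-method similarity scores.

    Assumption: the excerpt lists example weights (1.0, 1.2, 1.3, 1.4) but not
    the combination rule, so a normalized weighted sum is used here.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def shared_first_char(a: str, b: str) -> float:
    """Toy similarity used only for this illustration."""
    return 1.0 if a[0] == b[0] else 0.0


if __name__ == "__main__":
    print(f_nmi(f1=0.6, nmi=0.4))
    print(average_link(["ab", "ac"], ["ad", "bd"], shared_first_char))
    print(weighted_similarity([0.2, 0.4, 0.6, 0.8], [1.0, 1.2, 1.3, 1.4]))
```

With α = 0.5, F-NMI is simply the arithmetic mean of F1 and NMI, so neither measure dominates the combined score.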