Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Cross-Lingual Knowledge Validation Based Taxonomy Derivation from Heterogeneous Online Wikis
Authors: Zhigang Wang, Juanzi Li, Shuangjie Li, Mingyang Li, Jie Tang, Kuo Zhang, Kun Zhang
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed approach successfully overcomes the above issues, and experiments show that the approach significantly outperforms the designed state-of-the-art comparison methods. |
| Researcher Affiliation | Collaboration | Zhigang Wang, Juanzi Li, Shuangjie Li, Mingyang Li, Jie Tang: Department of Computer Science and Technology, Tsinghua University, Beijing, China. Kuo Zhang, Kun Zhang: Sogou Incorporation, Beijing, China. |
| Pseudocode | No | The paper describes the learning process with mathematical equations and steps, but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper states: 'The data sets are available at http://xlore.org/publications.action.' This refers to data, not source code for the methodology. |
| Open Datasets | Yes | The data sets are available at http://xlore.org/publications.action. |
| Dataset Splits | Yes | To demonstrate the better generalization ability of the DAB model, we conduct 2-fold cross-validation on the labeled dataset. Besides, in each iteration, we separate the cross-lingual validated results Vt into two folds, add one of them to the testing dataset, and use the other part to expand the pool Pt+1. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Weka (Hall et al. 2009)' and 'Stanford Parser (Green et al. 2011)' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Both the comparison methods and the DAB model use the default settings of Decision Tree in Weka (Hall et al. 2009). We use the Stanford Parser (Green et al. 2011) for head word extraction. The AdaBoost and DAB methods run 20 iterations. ... the threshold θ is experimentally set as 0.93. ... The parameter δ is used to limit the update speed, where in each iteration no more than δm examples are replaced from At. δ is experimentally set as 0.2. |
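The dataset-split and experiment-setup rows above describe an iterative scheme: in each iteration the cross-lingually validated results Vt are split into two folds, one fold is added to the test set, the other expands the training pool, and a parameter δ caps how many pool examples may be replaced per iteration. The sketch below illustrates that bookkeeping only; it is a hypothetical reconstruction, not the authors' code, and the helper names (`split_two_fold`, `expand_pool`) and list-based pool representation are assumptions.

```python
import random


def split_two_fold(items, seed=0):
    """Shuffle and split a list into two roughly equal folds
    (the 2-fold split applied to the validated results Vt)."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def expand_pool(pool, validated, delta=0.2):
    """Replace at most delta * len(pool) pool examples with validated ones,
    mirroring the update-speed limit delta described in the paper."""
    budget = int(delta * len(pool))        # no more than delta*m replacements
    incoming = validated[:budget]
    kept = pool[len(incoming):]            # drop oldest examples to make room
    return kept + incoming


# One illustrative iteration with toy data.
pool = [f"p{i}" for i in range(10)]        # training pool Pt (m = 10)
test_set = []
validated = [f"v{i}" for i in range(6)]    # cross-lingual validated results Vt

fold_a, fold_b = split_two_fold(validated)
test_set.extend(fold_a)                    # one fold grows the test set
pool = expand_pool(pool, fold_b, delta=0.2)  # the other expands the pool
```

With δ = 0.2 and a pool of 10, at most two examples enter the pool per iteration, so the pool size stays constant while its composition drifts slowly toward the validated data.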