Medical Synonym Extraction with Concept Space Models

Authors: Chang Wang, Liangliang Cao, Bowen Zhou

IJCAI 2015

Reproducibility assessment (each item lists the variable, the result, and the LLM response quoting the paper's supporting text):
Research Type: Experimental
    "Experiments on a dataset with more than 1M term pairs show that the proposed approach outperforms the baseline approaches by a large margin. The experimental results show that our synonym extraction models are fast and outperform the state-of-the-art approaches on medical synonym extraction by a large margin."
Researcher Affiliation: Industry
    "Chang Wang, Liangliang Cao, and Bowen Zhou. IBM T. J. Watson Research Lab, 1101 Kitchawan Rd, Yorktown Heights, New York 10598. {changwangnk, liangliang.cao}@gmail.com, zhou@us.ibm.com"
Pseudocode: No
    The paper presents mathematical derivations for the update rules (Equations 1 and 2) and defines its notation (Figure 2), but it does not include a block explicitly labeled "Pseudocode" or "Algorithm" showing step-by-step procedures.
Open Source Code: No
    The paper does not contain any explicit statement about providing open-source code for the described methodology, nor a link to a code repository.
Open Datasets: Yes
    "Our medical corpus has incorporated a set of Wikipedia articles and MEDLINE abstracts (2013 version) [http://www.nlm.nih.gov/bsd/pmresources.html]. We also complemented these sources with around 20 medical journals and books like Merck Manual of Diagnosis and Therapy. In total, the corpus contains about 130M sentences (about 20 GB of pure text), and about 15M distinct terms in the vocabulary set. The UMLS 2012 Release contains more than 2.7 million concepts from over 160 source vocabularies."
    Reference: [Lindberg et al., 1993] D. Lindberg, B. Humphreys, and A. McCray. The Unified Medical Language System. Methods of Information in Medicine, 32:281-291, 1993.
Dataset Splits: Yes
    "The final dataset was split into three parts: 60% of the examples were used for training, 20% were used for testing the classifiers, and the remaining 20% were held out to evaluate the knowledge-base construction results."
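The 60/20/20 split described above can be sketched as follows. This is a minimal illustration only; the paper does not specify the partitioning procedure, so a seeded random shuffle is assumed here:

```python
import random

def split_dataset(pairs, seed=0):
    """Split labeled term pairs into 60% train, 20% test, 20% held-out."""
    rng = random.Random(seed)
    pairs = list(pairs)
    rng.shuffle(pairs)
    n = len(pairs)
    n_train = int(0.6 * n)
    n_test = int(0.2 * n)
    train = pairs[:n_train]
    test = pairs[n_train:n_train + n_test]
    heldout = pairs[n_train + n_test:]
    return train, test, heldout

# With 100 examples this yields 60 / 20 / 20 disjoint subsets.
train, test, heldout = split_dataset(range(100))
```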
Hardware Specification: Yes
    "It took on average several hours to generate the word embedding file from our medical corpus of 20 GB of text using 16 3.2 GHz CPUs, and roughly 30 minutes to finish the training process using one CPU."
Software Dependencies: No
    The paper mentions software such as the Word2Vec model, the liblinear package, and the Medical ESG parser, but it does not provide specific version numbers for these dependencies.
Experiment Setup: Yes
    "The parameters used in the experiments were: dimension size = 100, window size = 5, negative = 10, and sample rate = 1e-5. In all the experiments, the weight for the positive examples was set to 100, due to the fact that most of the input examples were negative."
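The reported embedding hyperparameters map directly onto flags of the original word2vec command-line tool, and the positive-example weight of 100 corresponds to liblinear's per-class weight option. A minimal sketch, assuming those tools; all file names here are hypothetical:

```python
# Embedding hyperparameters reported in the paper, expressed as flags
# for the original word2vec tool (corpus and output paths hypothetical):
w2v_cmd = (
    "word2vec -train medical_corpus.txt -output embeddings.bin "
    "-size 100 -window 5 -negative 10 -sample 1e-5"
)

# Weight of 100 on positive examples, expressed via liblinear's -wi
# option (weight 100 for label +1; file names hypothetical):
svm_cmd = "train -w1 100 pairs.train model.out"

print(w2v_cmd)
print(svm_cmd)
```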