
Intrinsic and Extrinsic Evaluations of Word Embeddings

Authors: Michael Zhai, Johnny Tan, Jinho Choi

AAAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments show that the word embedding clusters give high correlations to the synonym and hyponym sets in WordNet, and give 0.88% and 0.17% absolute improvements in accuracy to named entity recognition and part-of-speech tagging, respectively.
Researcher Affiliation	Academia	Michael Zhai, Johnny Tan, Jinho D. Choi; Department of Mathematics and Computer Science, Emory University, Atlanta, GA 30322; {michael.zhai,johnny.tan,jinho.choi}@emory.edu
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	Yes	All resources are available at http://github.com/emorynlp.
Open Datasets	Yes	From WordNet, sets of synonyms and hyponyms of the 100 most frequent nouns and verbs in the New York Times corpus are extracted and compared to the clusters generated from the word embeddings. (Footnote 1: https://catalog.ldc.upenn.edu/LDC2008T19.) Also: The English portion of OntoNotes 5 is used for experiments following the standard split suggested by Pradhan et al. (2013).
Dataset Splits	Yes	The English portion of OntoNotes 5 is used for experiments following the standard split suggested by Pradhan et al. (2013).
Hardware Specification	No	The paper does not specify any hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions 'AdaGrad is used for training statistical models' but does not provide specific version numbers for any software dependencies.
Experiment Setup	Yes	AdaGrad is used for training statistical models. All of the above experiments are using the maximum cluster size of 1,500. We also tested on the max cluster size of 15,000, which showed very similar results. Additional experiments are conducted by concatenating the word and contextual vectors (w+c).
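
The Experiment Setup row mentions concatenating the word and contextual vectors (w+c). The following is a minimal sketch of what such a concatenation could look like, not the authors' implementation. The function name `concat_embeddings`, the dictionary layout, and the toy vectors are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Vector = List[float]


def concat_embeddings(
    word_vecs: Dict[str, Vector], context_vecs: Dict[str, Vector]
) -> Dict[str, Vector]:
    """Build w+c vectors: each word vector followed by its contextual vector.

    Assumption: both inputs map a token to a plain list of floats.
    Tokens that lack a contextual vector are skipped.
    """
    combined: Dict[str, Vector] = {}
    for token, w in word_vecs.items():
        c = context_vecs.get(token)
        if c is None:
            continue  # no contextual vector available for this token
        combined[token] = list(w) + list(c)
    return combined


# Toy 2-dimensional vectors (hypothetical values, not taken from the paper).
word_vecs = {"bank": [0.1, 0.2], "river": [0.3, 0.4]}
context_vecs = {"bank": [0.5, 0.6], "river": [0.7, 0.8]}

wc = concat_embeddings(word_vecs, context_vecs)
print(len(wc["bank"]))  # dimensionality doubles: 2 + 2
```

Under these assumptions, the concatenated vector has the combined dimensionality of the two inputs, so any downstream clustering or tagging model would need to accept the larger input size.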