Intrinsic and Extrinsic Evaluations of Word Embeddings
Authors: Michael Zhai, Johnny Tan, Jinho Choi
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the word embedding clusters give high correlations to the synonym and hyponym sets in WordNet, and give 0.88% and 0.17% absolute improvements in accuracy to named entity recognition and part-of-speech tagging, respectively. |
| Researcher Affiliation | Academia | Michael Zhai, Johnny Tan, Jinho D. Choi; Department of Mathematics and Computer Science, Emory University, Atlanta, GA 30322; {michael.zhai,johnny.tan,jinho.choi}@emory.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All resources are available at http://github.com/emorynlp. |
| Open Datasets | Yes | From WordNet, sets of synonyms and hyponyms of the 100 most frequent nouns and verbs in the New York Times corpus (footnote 1: https://catalog.ldc.upenn.edu/LDC2008T19) are extracted and compared to the clusters generated from the word embeddings. Also: The English portion of OntoNotes 5 is used for experiments following the standard split suggested by Pradhan et al. (2013). |
| Dataset Splits | Yes | The English portion of OntoNotes 5 is used for experiments following the standard split suggested by Pradhan et al. (2013). |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'AdaGrad is used for training statistical models' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | AdaGrad is used for training statistical models. All of the above experiments are using the maximum cluster size of 1,500. We also tested on the max cluster size of 15,000, which showed very similar results. Additional experiments are conducted by concatenating the word and contextual vectors (w+c). |
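
The w+c setting in the Experiment Setup row can be illustrated with a minimal sketch. It assumes each embedding table is a plain dictionary from token to a list of floats. The function name `concat_embeddings`, the toy vectors, and the choice to skip tokens that lack a contextual vector are all hypothetical and are not taken from the paper or from the emorynlp code.

```python
def concat_embeddings(word_vecs, context_vecs):
    """Return {token: w + c}, the word vector followed by the contextual vector."""
    combined = {}
    for token, w in word_vecs.items():
        c = context_vecs.get(token)
        if c is None:
            # Assumption: tokens without a contextual vector are skipped.
            continue
        combined[token] = list(w) + list(c)
    return combined


# Toy 2-dimensional vectors, invented for illustration only.
word_vecs = {"bank": [0.1, 0.2], "river": [0.3, 0.4], "rare": [0.9, 0.9]}
context_vecs = {"bank": [0.5, 0.6], "river": [0.7, 0.8]}

combined = concat_embeddings(word_vecs, context_vecs)
```

Under these assumptions each combined vector has twice the dimensionality of its inputs, so any downstream clustering or tagging model would need to accept the larger input size.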