Learning Word Representation Considering Proximity and Ambiguity
Authors: Lin Qiu, Yong Cao, Zaiqing Nie, Yong Yu, Yong Rui
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on two text corpora created from Wikipedia documents. The big one contains 1.6 billion words from all Wikipedia documents while the small one contains 42 million words from a random sample of Wikipedia documents. We use the test set proposed in (Mikolov et al. 2013a) to measure the quality of the learned representation vectors. |
| Researcher Affiliation | Collaboration | Shanghai Jiao Tong University, {lqiu, yyu}@apex.sjtu.edu.cn Microsoft Research, {yongc, znie, yongrui}@microsoft.com |
| Pseudocode | No | The paper provides architectural diagrams and mathematical formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the described methodology. |
| Open Datasets | Yes | We train a CRF POS tagger on the Wall Street Journal data from Penn Treebank III (Marcus, Marcinkiewicz, and Santorini 1993). |
| Dataset Splits | No | The paper mentions using a test set from another paper and training on Wikipedia documents, but it does not explicitly specify training, validation, and test splits for its own corpora. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions using a CRF POS tagger and various algorithms (e.g., SGD, back-propagation), but it does not list specific software or library names with version numbers. |
| Experiment Setup | Yes | We conduct experiments on two text corpora created from Wikipedia documents. Three different vector dimensions (50, 300 and 600) are employed in the experiments. The initial values of the proximity weights are 1s instead of random values. The training of each model lasts 3 epochs. We employ Medium-Grained POS in both models. The window size is set to 31. |